Report R-210 



COMPUTER PROGRAM SYNTHESIS BASED ON
STATISTICAL COMMUNICATION THEORY 



Submitted to the 

OFFICE OF NAVAL RESEARCH 
Under Contract N5ori-06002

Project NR 232-001 



by 
Abraham Katz 



DIGITAL COMPUTER LABORATORY 
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Cambridge 39, Massachusetts
DIC 6782

September 28, 1952 
(Thesis Date: September 28, 1951)






FOREWORD 

Because it presents information of interest to those studying
the application of digital computers to the real-time control of
physical systems, this revised thesis investigation is now being issued
as an R-series report by the Digital Computer Laboratory at M.I.T.

In such servomechanisms applications, it may frequently be
desirable, when computer capacity is available, that the computer be
programmed to abstract or otherwise modify the intelligence from a noisy
input signal. In this mode of operation, the computer simulates the
conventional filter or compensating network. This report concerns itself
with methods for optimizing, in a mean square error sense, computer
programs which can effect such a filtering process.

The author is indebted to Professor W. K. Linvill for his
supervision of this research, and to the Digital Computer Laboratory for
the free use of its facilities and for the interest and advice of many of
its personnel.






ABSTRACT

Since the use of the large-scale calculating machine as an element in a
servomechanism is being actively studied, procedures must be devised for the
specification of computer programs which will enable the computer to abstract
intelligence from an incoming signal sequence. Here we treat the synthesis
of programs (i.e., discrete filters) which process the input sequence in such
a manner as to obtain the best possible approximation, in a least square error
sense, to a specified function of that sequence. The synthesis procedures are
based on equations analogous to the Wiener-Hopf equation. By means of these
equations it is possible to specify the optimum least-squares filter whether
it be linear or not, time-varying or not. In any case, the optimum filter is
obtained by solving a set of simultaneous linear algebraic equations.

CHAPTER I
INTRODUCTION

1.01 Historical Note

In recent years large-scale digital computers have been brought to
that stage of development where their application to the control of dynamic
systems is practically realizable. Computer applications, in general, may
be divided roughly into the following categories:

a) real-time - that in which the computer must cope
with a dynamically changing problem, and in which
the computed results directly affect the evolution
of the problem (e.g., chemical process control,
simulation), and
b) non-real-time - that in which the computer copes with
a static or quasi-static problem, and in which the
computed results have little or no effect on the
evolution of the problem (e.g., scientific computations,
economic analysis).

A real-time application to a complex control problem frequently implies
a computer having a very high operating speed as well as rather extensive
storage facilities. One such computer, Whirlwind I, is being currently
developed by the M.I.T. Digital Computer Laboratory. It should be
noted that the use of the digital computer in control systems is but a
logical extension of servomechanisms applications.

Assuming that the computer and its associated conversion apparatus
are properly designed with respect to the particular control system at hand,
the question which then arises is the formulation of computer programs which
will accomplish the desired results. Viewed in proper perspective, the
digital computer can be visualized as a docile slave with a very strong back
but a rather feeble mind. Although plagued by a small "memory" and by a
limited repertoire of arithmetic and logical "thought processes", this slave
is nevertheless capable of consulting this "memory" and executing these
processes at truly fantastic speeds. It remains for the applications engineer
to instruct the slave in the manner in which "he" is to perform "his" duties.

Generally these computer programs are formulated in terms of
difference equations (i.e., time domain synthesis) with the computer solving
the equations by means of arithmetic operations and modifying its course
of action by means of logical operations. In a recent doctoral thesis,
Salzer presents a method for the coding of programs which depends on
frequency domain synthesis. Since the computer program is essentially a
data-processing filter, one would logically expect that many of the methods
of analog filter theory are extensible to program synthesis. This thesis
constitutes an attempt at program synthesis based on the concepts of
Wiener-Lee theory [1, 6, 7]. In a sense, it is an effort at endowing the
digital computer with a higher order of intelligence, one capable of
interpreting a message in terms of its statistical distributions in time.

In his classic monograph, "Extrapolation, Interpolation, and
Smoothing of Stationary Time Series", N. Wiener demonstrated the basic
unity of the previously unrelated philosophies underlying the investigation
of time series in statistics and of message transmission in communication
engineering. This study parallels to a great extent the somewhat earlier
work of A. Kolmogoroff and thus represents another instance of "multiple
discovery", that curious historical phenomenon which has recurred so
frequently in the development of science. Although each of these men was
concerned with a slightly different problem, both sought, through an
extension of the theory of stochastic processes, to establish the bases for
the optimum prediction of time series.

Time series may be defined as sequences, discrete or continuous,
of quantitative data assigned to specific moments of time. Since the
majority of types of time-variations encountered, both in statistics and
in communications, are not of the regular functional type in which the
function f(t) can be represented exactly by a mathematical function of
t, they can be studied only with respect to the statistics of their
distributions in time. From this it follows that the separation of the true
data from the noise in any set of sequences must logically be preceded by
a study of the statistical characteristics of the set. Since the concepts
of statistics are based on large collections or ensembles of events,
we remark that the performances of the filters considered herein are to
be optimized over an ensemble of possible signals.

By making certain restrictive assumptions as to the properties
of the ensemble, Wiener was able to simplify the specification of the
optimum filter. The basic concept of his theory was that communication
signals were to be regarded as stationary, ergodic time series. A
stationary ensemble is one whose statistical properties do not vary with
time; that is, the statistical regularities of the past will hold in
the future. An ergodic ensemble is a special case of a stationary
ensemble for which any member function is statistically representative
of any other member function. For the ergodic ensemble one can, in
computing statistical parameters, replace ensemble averages by time
averages. This replacement permits a considerable saving in time and effort
since the former averaging process is much more difficult than the latter.
Furthermore, the assumption of stationarity permits the design of the
optimum filter in the form of a constant-coefficient system.
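
As an illustrative sketch only (it is not part of the original
investigation), the replacement of ensemble averages by time averages can
be checked numerically. The Python fragment below assumes a hypothetical
ergodic ensemble of independent samples uniformly distributed on [-1, 1],
whose true mean square value is 1/3; the function names, the seed, and the
ensemble itself are assumptions made for the illustration.

```python
import random


def time_average_power(sequence):
    """Mean square value of one member function, averaged over time."""
    return sum(x * x for x in sequence) / len(sequence)


def ensemble_average_power(ensemble, k):
    """Mean square value at sampling instant k, averaged over all members."""
    return sum(member[k] ** 2 for member in ensemble) / len(ensemble)


rng = random.Random(0)  # fixed seed so that the sketch is repeatable

# Hypothetical ergodic ensemble: independent samples uniform on [-1, 1].
long_member = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
ensemble = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(20000)]

time_avg = time_average_power(long_member)            # one member, many instants
ensemble_avg = ensemble_average_power(ensemble, k=2)  # many members, one instant
print(time_avg, ensemble_avg)  # both estimates lie close to 1/3
```

The time average needs only a single long record, whereas the ensemble
average needs a separate record for every member; this is the saving in
effort referred to above.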

If the concept of treating messages as stationary time series
is accepted, the problem which then arises is: what statistical parameters
are of importance in the design of optimum filters? In this, as
in most physical problems, the solution is more readily obtained if the
limitation of linearity is imposed on the types of operations to be used
in the filtering. With this limitation on the characteristics of the
filters, it is immediately apparent that the only statistical parameters
which need be considered are the signal power spectra or their time
domain equivalents, the linear correlation functions. It should be noted
that if the limitation of linearity is not imposed, then the statistical
parameters of importance are both the linear and the higher order
non-linear correlation functions. This point will be treated in somewhat
greater detail later (cf. Chap. II).

Prior to the publication of Wiener's monograph, the problem
of filter design was generally handled by the classical methods of
(a) prescribing either the desired frequency response or the desired
transient response, and (b) synthesizing a network which approximates
this response. Not only did Wiener suggest an entirely different approach
to the synthesis of linear filters, but he also provided the basis
for the development of a theory of information. Much work has since
been done in the interpretation and extension of the Wiener theory and
in the mechanization [12] of the rather laborious procedure of computing
correlation functions.

It should be emphasized that this optimization theory need not
be restricted to stationary ensembles nor the filtering system to linear
operations. Booton has generalized the theory to include time-varying
linear systems with nonstationary statistical inputs. We shall show
that, for the "discrete filter", one can specify the optimum system with
the same ease regardless of whether the restrictions of linearity and
stationarity are imposed or not.

1.02 Definition of Problem

Having sketched the origin of the statistical communication
theory, we now consider the extension of its techniques to the synthesis
of programs capable of dealing with a specific kind of noise. The noise
in question is that arising from the conversion of a continuous time
series to one which is discrete.

Information that is to be processed by a digital computer must
generally be made available to it in the form of signals which are
discretized both in time of occurrence and in magnitude. Thus, in a control
application, a continuously varying signal which is related to the
controlled variable of the system may be sampled and quantized by means of
encoding devices to give one of a finite set of discrete magnitudes for
each distinct time interval. Note that the term "sampling" is used for
the process of discretizing in time, "quantizing" is used for that in
amplitude, and "encoding" is used for the process (involving sampling
and quantizing) whereby a continuous time series is converted to one
which is discretized in time and amplitude.

Whether the original input signal is noise-free or not, it
is distorted in the process of being encoded. This distortion is caused
primarily by the fact that the quantizer can reproduce only approximately
an instantaneous sample by assigning to it the value of the nearest
quantizing level. To attain an accuracy greater than one part in a
thousand requires a rather complex electronic system. The result is that
this analog-to-digital conversion by the encoder produces data which are
good to, say, three decimal figures. When these sampled quantized data
are subsequently used in the computation of a correction to the controlled
variable of the system, the inevitable accumulation of round-off errors
soon destroys the usefulness of the computation. Round-off error stems
from the necessity of operating on data which approximate all real
numbers by rational numbers with a finite number of digits.
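
The encoding operation described above can be sketched as follows. This
is a minimal illustration, not a description of any actual encoder: the
uniform quantizing step, the sinusoidal test signal, and the function
names are all assumptions made for the example.

```python
import math


def quantize(sample, step):
    """Assign to a sample the value of the nearest quantizing level."""
    return step * round(sample / step)


def encode(signal, sampling_interval, num_samples, step, t0=0.0):
    """Sample a continuous signal at t0 + k*T_r and quantize each sample."""
    return [quantize(signal(t0 + k * sampling_interval), step)
            for k in range(num_samples)]


step = 0.001  # one part in a thousand of a unit-amplitude signal
encoded = encode(math.sin, sampling_interval=0.1, num_samples=50, step=step)

# Quantization error of each sample relative to the unquantized value.
errors = [abs(q - math.sin(0.1 * k)) for k, q in enumerate(encoded)]
worst_error = max(errors)
print(worst_error)
```

With a uniform quantizer the error of any one sample cannot exceed half
a quantizing step; it is this bounded but ever-present error which acts
as the noise treated in this report.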

The problem at hand is to synthesize discrete filters (i.e.,
computer programs) which, when supplied with a noisy data input, will yield
the best possible approximation, in a least mean-square error sense, to
the message or some function thereof. As an example: if we assume that
quantization noise is the predominant noise component, then, for the
register length of the Whirlwind I computer, one might be called upon to
specify that system which will process incoming g-digit data in such a
way as to obtain the best possible approximation (in the specified sense)
to a 15-digit value of some given function of the input data.

1.03 Discussion of Proposed Method of Solution

Having formulated our problem and indicated the technique that will
be used to obtain a solution, we now consider the limitations implicit in the
assumption that the criterion of performance shall be the minimization of the
mean square error over the ensemble. Whether this is indeed the sense in
which a system should be optimized is by no means certain. In establishing
suitable performance criteria, engineers are concerned with questions of
values rather than questions of fact. The absence of valid criteria which
lead to solvable problems requires that the engineer make some arbitrary
decision as to the "criterion of goodness" of a system.

In formulating his theory of statistical prediction, Wiener was
confronted by this problem and arbitrarily decided to use the criterion of
least square error. Of all quantities which lend themselves to an easy
minimization, the most natural are those which are inherently positive
because they are squares of some simple real expression or sums of such
squares. Wiener was well aware that this criterion had serious faults.
Firstly, it puts an over-emphasis on those points where the predicted and
actual values differ by a large amount; and secondly, it gives a considerable
weight to small errors occurring with great frequency over a long interval
of time. Thus we find that this criterion is overly sensitive at both ends
of the scale of magnitude of error, and slights intermediate values of error.

If one examines this criterion objectively, one finds that, in
addition to having the considerable virtue of leading to a mathematically
manageable problem, it frequently leads to rather good results. A well-known
fact is that the application of the Wiener-Lee theory to the filtering of a
Gaussian signal leads to a design which is absolutely optimal. Zadeh has
suggested that a more appropriate criterion of design would be the
maximization of the probability that the error at a prespecified time be
less than some prescribed tolerance. The maximization procedure leads to a
set of equations whose solution in practice must be carried out by trial and
error. If the nature of the system is such that certain conditions are
satisfied, then the probability criterion yields the same values for the
design constants as does the least square error criterion. These conditions
require that either (a) the systematic error as well as the geometric mean of
the tolerance and the systematic error at the prespecified time be small in
comparison with the r.m.s. value of the random error, or (b) the systematic
error be large in comparison with the random error, and the tolerance be
approximately equal to the systematic error. Floyd [15] has shown that the
criterion of maximization of a probability density is the equivalent of the
criterion of least squares when the bias errors for an ensemble of signals
have a normal distribution about zero. Stutt [11], after making an
experimental study of "optimum" filters, concluded that least square error
network specifications are generally sensible and lead to designs which are
relatively non-critical.

The crux of the problem of defining suitable performance criteria
lies in the fact that there is little understanding as to what properties
of a message are actually utilized by the ultimate receptors and therefore
should be retained in the output of the filter. It is obvious that the
stationary time series which the filter should handle is a function of the
destination. In the absence of information concerning the message-utilizing
properties of the ultimate receptors we must content ourselves with an
over-all system which is not optimal, even in the least square error sense.



The extreme simplicity of the least squares procedure can be
illustrated by the following example:

Let it be required that we construct a linear, constant-coefficient
filter for which

        i_k = input at time t_k,

        a_k = desired output at t = t_k,

        o_k = actual output at t = t_k

            = \sum_{n=0}^{M} A_n i_{k-n},

where A_n = coefficients by which input data are to be weighted, and

        \epsilon_k = a_k - o_k = error at t = t_k.

The mean square error is \overline{\epsilon_k^2}, and can be minimized
by a proper choice of the weighting coefficients, A_n. The conditions for a
minimum are

        \frac{\partial \overline{\epsilon_k^2}}{\partial A_n} = 0
                for n = 0, 1, ..., M.

But

        \frac{\partial \overline{\epsilon_k^2}}{\partial A_n}
            = \overline{2 \epsilon_k \frac{\partial \epsilon_k}{\partial A_n}}
            = -2 \overline{\epsilon_k i_{k-n}}

or

        \overline{\epsilon_k i_{k-n}} = 0    for n = 0, 1, ..., M.

This is the discrete form of the Wiener-Hopf equation for the linear
constant-coefficient filter. Expanding this equation, we find that the
optimum set of coefficients is determined by the equations

        \sum_{n=0}^{M} A_n \overline{i_{k-n} i_{k-m}} = \overline{a_k i_{k-m}}

for m = 0, 1, ..., M.
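
The example can be sketched in Python as follows. This is a minimal
illustration under stated assumptions: the averages are estimated as
arithmetic means over a finite record, the simultaneous equations are
solved by ordinary Gaussian elimination, and the function names and the
test record are hypothetical.

```python
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def least_squares_weights(inputs, desired, M):
    """Weights A_0..A_M from sum_n A_n avg(i[k-n] i[k-m]) = avg(a[k] i[k-m])."""
    ks = range(M, len(inputs))  # instants at which all M+1 inputs exist
    count = len(ks)
    lhs = [[sum(inputs[k - n] * inputs[k - m] for k in ks) / count
            for n in range(M + 1)] for m in range(M + 1)]
    rhs = [sum(desired[k] * inputs[k - m] for k in ks) / count
           for m in range(M + 1)]
    return solve_linear_system(lhs, rhs)


# Hypothetical record: the desired output is itself a weighted sum of inputs,
# so the least squares weights should reproduce the generating weights.
rng = random.Random(1)
inputs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
desired = [0.0] + [0.5 * inputs[k] + 0.25 * inputs[k - 1] for k in range(1, 200)]
weights = least_squares_weights(inputs, desired, M=1)
print(weights)
```

In this contrived case the error can be made to vanish, so the computed
weights agree with the generating values 0.5 and 0.25 to within
round-off; with a noisy record they would instead be the weights of
smallest mean square error.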

The reader will find that the above-described procedure or
variations thereof appear throughout this report. Understanding of this
simple example enables one to specify rigorously an optimum design for
computer programs which are to abstract information from a noisy input data
sequence.

CHAPTER II
THEORETICAL ANALYSIS

2.00 General Remarks

Before considering the derivation of procedures for specifying
classes of optimum filters, we first define the terms "discrete filter"
and "continuous filter", and then discuss certain analogies between these
types of filters. A "discrete filter" is a transmission device which,
when supplied with data from an external source at specified equally spaced
moments, furnishes at the same moments output data which depend on past
input data and possibly on past output data and time. The input and output
signals are discrete time series, and the filter itself is characterized by
a sequence (or sequences) of numerical weights. In contrast to this is
the conventional "continuous filter" which is characterized by physical
elements such as resistance, inductance, and capacitance, and whose input
and output are usually continuous time series.

A linear filter, be it discrete or continuous, is one for which
there is a linear relation between input and output. This relation may be
expressed by recourse to the superposition principle. Confining our
attention for the moment to linear systems, we remark that many of the
concepts of conventional filter theory have their analogies in the discrete
domain. For the constant-coefficient continuous system (Fig. 2.00-1a), the
input-output relation is defined by the superposition integral

        f_o(t) = \int_{-\infty}^{t} h(t - \tau) f_i(\tau) \, d\tau        (2.00-1)

               = \int_{0}^{\infty} h(\tau) f_i(t - \tau) \, d\tau         (2.00-1a)




[Fig. 2.00-1. Block diagrams of linear constant-coefficient systems:
(a) continuous filter; (b) discrete filter.]

where   f_i(t) = input signal

        f_o(t) = actual output signal

        h(t) = response of system to a unit impulse applied
               at t = 0. For a physically realizable system,
               h(t) = 0 for t < 0.

To facilitate the understanding of linear systems, one frequently studies the
equivalent frequency domain representation obtained by Laplace transforming
equation 2.00-1a. Starting from the definition

        F(s) = \int_{0}^{\infty} f(t) e^{-st} \, dt

we obtain

        F_o(s) = H(s) F_i(s)                                    (2.00-2)

where H(s) = system transfer function.

Analogous equations defining the corresponding discrete system are
easily derived. Before proceeding with the derivation we note that filters
in which both input and past output data are weighted are equivalent to those
in which an infinite number of input data are weighted. In this report we
shall consider only those filters in which a finite number of input data
only are weighted.

For the discrete system (Fig. 2.00-1b) in which the coefficients
(or elements) of the weighting sequence are invariant with time, the
input-output relation is defined by the superposition summation:

        c_k = \sum_{n=0}^{M} A_n b_{k-n}                        (2.00-3)

where   k = 0, 1, 2, ...
          = discrete time variable identifying sample
            datum at t = t_o + k T_r

        T_r = sampling interval

        b_k = b(k T_r) = b(t_o + k T_r)
            = input sequence

        c_k = actual output sequence

        A_n = sequence of weighting coefficients
              analogous to the impulse response of the
              continuous system.

Since the "memory" of this class of filters has been limited to M+1 samples,
we might denote this as a "finite-memory" filter to distinguish it from the
class in which both input and past output data are weighted.
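
As a computer program, the superposition summation (2.00-3) amounts to
a short loop. The following Python sketch is illustrative only; it
assumes that input data before k = 0 are zero, and the numerical weights
are hypothetical.

```python
def finite_memory_filter(weights, inputs):
    """c_k = sum_{n=0}^{M} A_n * b_{k-n}, taking b_j = 0 for j < 0."""
    outputs = []
    for k in range(len(inputs)):
        c_k = sum(a_n * inputs[k - n]
                  for n, a_n in enumerate(weights) if k - n >= 0)
        outputs.append(c_k)
    return outputs


weights = [0.5, 0.25, 0.25]               # A_0, A_1, A_2 (M = 2), hypothetical
unit_sample = [1.0, 0.0, 0.0, 0.0, 0.0]   # discrete analogue of a unit impulse
response = finite_memory_filter(weights, unit_sample)
print(response)
```

The response to a single unit sample reproduces the weighting sequence
and then vanishes after M+1 samples, which is the sense in which the
filter has a finite memory.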

A frequency domain representation of this system is readily obtained
in the following manner. We define

        C(e^{-sT_r}) = \sum_{k=0}^{\infty} c_k e^{-ksT_r}

then, inserting equation 2.00-3 in this definition (and taking the input
data to be zero before k = 0),

        C(e^{-sT_r}) = \sum_{n=0}^{M} A_n e^{-nsT_r}
                       \sum_{k=0}^{\infty} b_{k-n} e^{-(k-n)sT_r}

        C(e^{-sT_r}) = G(e^{-sT_r}) B(e^{-sT_r})                (2.00-4)

where B(e^{-sT_r}) is the corresponding transform of the input sequence and

        G(e^{-sT_r}) = \sum_{n=0}^{M} A_n e^{-nsT_r}

is the system transfer function defined by Salzer.

The analogies between the continuous and the discrete systems,
which are now apparent, are set forth explicitly below.

        Continuous System                    Discrete System

   1. Defined by                        1. Defined by
      a) Differential equation             a) Difference equation
      b) Superposition integral            b) Superposition summation

   2. Transfer function of constant-    2. Transfer function of constant-
      coefficient system depends           coefficient system depends
      on s.                                on e^{sT_r}.

   3. Impulse response or weighting     3. Weighting sequence defined by
      function defined by h(t) for         A_n for the time-invariant case.
      the time-invariant case.
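
To make the second analogy concrete, the transfer function of a
finite-memory filter can be evaluated numerically as a polynomial in
e^{-sT_r}. The sketch below is illustrative; the weights and the
sampling interval are hypothetical, and s is taken on the imaginary axis
(s = jw) in order to obtain a frequency response.

```python
import cmath


def transfer_function(weights, s, sampling_interval):
    """G = sum_{n=0}^{M} A_n * exp(-n*s*T_r), a polynomial in exp(-s*T_r)."""
    z_inv = cmath.exp(-s * sampling_interval)
    return sum(a_n * z_inv ** n for n, a_n in enumerate(weights))


weights = [0.5, 0.25, 0.25]   # hypothetical A_0, A_1, A_2
T_r = 0.01                    # hypothetical sampling interval, seconds

dc_gain = transfer_function(weights, 0j, T_r)
# Response at half the sampling rate, where exp(-s*T_r) = -1.
half_rate_gain = transfer_function(weights, 1j * cmath.pi / T_r, T_r)
print(abs(dc_gain), abs(half_rate_gain))
```

At zero frequency the gain is simply the sum of the weights; at half the
sampling rate the weights enter with alternating signs.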



For the case of the linear time-varying system, the input-output
relation of the continuous filter (Fig. 2.00-2a) is defined by the more
general form of the superposition integral



f (t) = f h(r,t)f (tOd?r 

=y h(t -<r,<r)f^(cr)d(r 



(2»oo-.5) 



(2*00-5a) 



The corresponding discrete system (Fig. 2.00-2b) is analogously defined by
the superposition summation

        c_k = \sum_{n=0}^{M} A_{n,k} b_{k-n},    k = 0, 1, 2, ...        (2.00-6)




[Fig. 2.00-2. Block diagrams of linear time-varying systems:
(a) continuous filter; (b) discrete filter.]

From the foregoing discussion it is evident that many of the
techniques of continuous filter theory may be logically extended to discrete
systems. Much work has been done by Salzer in utilizing frequency domain
methods in describing and in establishing criteria for stability of linear
computer programs (linear discrete filters). We shall here concern ourselves
with the discrete version of the statistical approach developed by Wiener,
Levinson, Kolmogoroff, and others.

Turning our attention now to non-linear systems, we note that in
such systems, there is a non-linear relation between input and output. In
other words the characteristics of a non-linear filter depend on the signal,
and possibly on time. If we restrict our investigation to "finite-memory"
systems, we see that a fairly general non-linear program may be described
by the equation

*k= 4, ^n,k \-n + ^ \,k \-m 
n^o tusQ 

■K ~ U, 1, 2, • • «- 

If the performance of the program is time-invariant, then

        A_{n,k} = A_n

        B_{m,k} = B_m

In the following sections we shall develop procedures for
specifying the optimum (mean-square error) discrete finite-memory filter
whether it be linear or not, and whether its input be stationary or not. It
should be noted that the specification of the corresponding infinite-memory
filters can be derived by methods entirely analogous to those to be set forth
below. However, an approximation is needed to make the problem manageable,
so that the resulting design is not strictly an optimum.

2.01 Synthesis Procedure for Linear Time-Invariant Programs

The linear time-invariant program is characterized by the
input-output relation

        c_k = \sum_{n=0}^{M} A_n b_{k-n},    k = 0, 1, 2, ...

Since the actual filter output c_k may differ from the ideal, or desired,
output a_k, an error \epsilon_k will be present (see Fig. 2.01-1).

Levinson has developed a simple computational procedure for
discrete filter design which is applicable to the problem at hand. This
procedure is, essentially, classical least squares with the additional
specifications of linearity of the filter and stationarity of the input
sequence. Recapitulating (with some changes in notation) part of Levinson's
procedure, we now determine the nature of the linear program which, with
input b_k, will have an output as close as possible to the desired output
a_k. An error quantity

        \epsilon_k = a_k - c_k = a_k - \sum_{n=0}^{M} A_n b_{k-n}

may therefore be defined. Our problem is to derive a sequence of weights
A_n such that we minimize

        I_{lc} = mean square error for linear constant-coefficient filter

or

        I_{lc} = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{k=-N}^{N} \epsilon_k^2

               = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{k=-N}^{N}
                 \Big( a_k - \sum_{n=0}^{M} A_n b_{k-n} \Big)^2

               = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{k=-N}^{N}
                 \Big( a_k^2 - 2 a_k \sum_{n=0}^{M} A_n b_{k-n}
                 + \sum_{n=0}^{M} \sum_{m=0}^{M} A_n A_m b_{k-n} b_{k-m} \Big)
                                                                (2.01-4)



Thus far the derivation differs in no respect from classical least squares.
If the structure of the series is such that stationarity (or at least
quasi-stationarity) is assured, we may now define the auto- and
cross-correlation functions:

        R_{bb}(k) = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{j=-N}^{N} b_j b_{j-k}

        R_{ba}(k) = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{j=-N}^{N} a_j b_{j-k}

with R_{aa}(k) defined in the same manner as R_{bb}(k). Equation (2.01-4)
assumes a simpler form when we introduce the correlation functions:

        I_{lc} = R_{aa}(0) - 2 \sum_{n=0}^{M} A_n R_{ba}(n)
                 + \sum_{n=0}^{M} \sum_{m=0}^{M} A_n A_m R_{bb}(n - m)

Using the techniques of the calculus of variations, we determine those
values of A_k which minimize I_{lc} by setting

        \frac{\partial I_{lc}}{\partial A_k} = 0    for k = 0, 1, ..., M.

This yields the Wiener-Hopf equation for the linear finite-memory filter:

        \sum_{n=0}^{M} A_n R_{bb}(k - n) = R_{ba}(k)            (2.01-6)

where k = 0, 1, ..., M.

For a predicting filter, the Wiener-Hopf equation is of the form:

        \sum_{n=0}^{M} A_n R_{bb}(k - n) = R_{ba}(k + s)        (2.01-7)

where s = integral number greater than
          zero denoting the number of
          sampling periods in the future
          by which the prediction is made.

The minimum value of I_{lc} for prediction is then

        (I_{lc})_{min} = R_{aa}(0) - \sum_{n=0}^{M} A_n R_{ba}(n + s)

Levinson has shown that increasing M always improves the quality
of the filtering.
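
Once the correlation functions are known, the synthesis reduces to the
solution of M+1 simultaneous linear equations. The following Python
sketch solves equation (2.01-7) for a hypothetical case: the input is
assumed to have the autocorrelation R_bb(k) = 0.8^|k|, and the quantity
to be predicted is the input itself one sampling period ahead (so that
R_ba = R_bb and s = 1). The solver, the function names, and the
correlation function are assumptions made for the illustration.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def wiener_hopf_weights(r_bb, r_ba, M, s=0):
    """Solve sum_n A_n R_bb(k-n) = R_ba(k+s) for k = 0..M.

    R_bb is an even function of its argument, so only |k-n| is needed.
    """
    matrix = [[r_bb(abs(k - n)) for n in range(M + 1)] for k in range(M + 1)]
    rhs = [r_ba(k + s) for k in range(M + 1)]
    return solve_linear_system(matrix, rhs)


rho = 0.8  # hypothetical correlation between adjacent input samples


def r_bb(lag):
    """Assumed autocorrelation of the input sequence."""
    return rho ** abs(lag)


weights = wiener_hopf_weights(r_bb, r_bb, M=2, s=1)
min_error = r_bb(0) - sum(a_n * r_bb(n + 1) for n, a_n in enumerate(weights))
print(weights, min_error)
```

For this particular correlation function the optimum predictor weights
only the most recent sample, with A_0 = 0.8, and the minimum mean square
error is 1 - 0.8^2 = 0.36; the remaining weights vanish to within
round-off.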

In order to synthesize a linear finite-memory program as a
sequence of M+1 weights, A_n, one need merely solve the Wiener-Hopf equations
(2.01-6) or (2.01-7). In treating the concept of stability for this class
of filters, we follow Hurewicz [17] by defining a filter as stable if to a
bounded input there always corresponds a bounded output. A necessary and
sufficient condition for stability is absolute convergence of the weighting
sequence {A_n}, that is

        \sum_{n=0}^{M} |A_n| < \infty

Since we treat only those filters with finite "memories", the only condition
required for stability is that all A_n be finite.
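
The reasoning behind this condition can be made explicit: if every input
datum is bounded in magnitude by some number B, then the output cannot
exceed B times the sum of the magnitudes of the weights. The short
Python sketch below illustrates the bound; the weights and the input
bound are hypothetical.

```python
def output_bound(weights, input_bound):
    """Largest possible |c_k| when every |b_k| <= input_bound."""
    return input_bound * sum(abs(a_n) for a_n in weights)


def worst_case_output(weights, input_bound):
    """Output when each input takes the sign of the weight applied to it."""
    worst_inputs = [input_bound if a_n >= 0 else -input_bound for a_n in weights]
    return sum(a_n * b for a_n, b in zip(weights, worst_inputs))


weights = [0.5, -0.25, 0.25]   # hypothetical finite weighting sequence
bound = output_bound(weights, input_bound=2.0)
worst = worst_case_output(weights, input_bound=2.0)
print(bound, worst)
```

The bound is attained by the worst-case input, and it is finite whenever
the finite set of weights is itself finite.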

Just as the stability of continuous linear systems can be studied
in the frequency domain, so can one make a similar study for discrete linear
systems. It will be found, in general, that the class of filters under
discussion always have a transfer function in the form of a polynomial in
e^{-sT_r}.



It is instructive to follow through the derivation for the simple
case of prediction when M = 2 in order to observe the effects of imposing
the specification of stationarity.

        \epsilon_k = a_{k+s} - (A_0 b_k + A_1 b_{k-1} + A_2 b_{k-2})

        I_{lc} = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{k=-N}^{N} \epsilon_k^2

               = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{k=-N}^{N}
                 \big[ a_{k+s}^2 + A_0^2 b_k^2 + A_1^2 b_{k-1}^2 + A_2^2 b_{k-2}^2
                 + 2 A_0 A_1 b_k b_{k-1} + 2 A_1 A_2 b_{k-1} b_{k-2}
                 + 2 A_0 A_2 b_k b_{k-2}
                 - 2 A_0 a_{k+s} b_k - 2 A_1 a_{k+s} b_{k-1}
                 - 2 A_2 a_{k+s} b_{k-2} \big]

If the sequence b_k is stationary, then

        \lim_{N \to \infty} \frac{1}{2N+1} \sum_k b_k^2
            = \lim_{N \to \infty} \frac{1}{2N+1} \sum_k b_{k-1}^2
            = \lim_{N \to \infty} \frac{1}{2N+1} \sum_k b_{k-2}^2

and

        \lim_{N \to \infty} \frac{1}{2N+1} \sum_k b_k b_{k-1}
            = \lim_{N \to \infty} \frac{1}{2N+1} \sum_k b_{k-1} b_{k-2}.

Introducing the correlation functions, we have

        I_{lc} = R_{aa}(0) + (A_0^2 + A_1^2 + A_2^2) R_{bb}(0)
                 + (2 A_0 A_1 + 2 A_1 A_2) R_{bb}(1) + 2 A_0 A_2 R_{bb}(2)
                 - 2 A_0 R_{ba}(s) - 2 A_1 R_{ba}(s+1) - 2 A_2 R_{ba}(s+2)

Imposing the conditions for the minimization of I_{lc} we obtain the system
of equations

        \frac{\partial I_{lc}}{\partial A_0} = 0,
        \frac{\partial I_{lc}}{\partial A_1} = 0,
        \frac{\partial I_{lc}}{\partial A_2} = 0.

If stationarity of the input cannot be assumed, then one cannot
define meaningful correlation functions on a time-averaging basis. We
now consider the approach to be followed when the input is nonstationary.

2.02 Synthesis Procedure for Linear Time-Varying Program

An optimum filter for the case of the non-stationary ensemble
can be specified in a manner quite similar to that of the stationary,
ergodic ensemble. If the statistical characteristics of an ensemble are
time-varying, then a filter whose performance is optimized on the basis
of these characteristics will, in general, be time-varying. Thus the
elements of the optimum filter will be a function of the sampling instant, k,
at which the processing of the input data is to occur.

Let     a_{r,k} = desired output datum for the rth member of the
                  ensemble at the kth sampling interval.

        b_{r,k} = raw input datum for the same member at the kth
                  sampling interval.

        A_{n,k} = coefficient by which b_{r,k-n} is to be weighted at
                  the kth sampling interval. Note that the same
                  A_{n,k} applies to every member r of the
                  ensemble. The linear combination of the weighted
                  input data then forms the actual output of the
                  filter for the rth member at the kth instant.

        c_{r,k} = actual output for rth member at kth instant

                = \sum_{n=0}^{M} A_{n,k} b_{r,k-n}              (2.02-1)

We now define an error quantity

        \epsilon_{r,k} = a_{r,k} - c_{r,k}

                       = a_{r,k} - \sum_{n=0}^{M} A_{n,k} b_{r,k-n}      (2.02-2)

and seek to minimize the mean square of this error, I_{lv}(k), over the
ensemble of signals.

        I_{lv}(k) = \lim_{N \to \infty} \sum_{r=-N}^{N} p(r) \epsilon_{r,k}^2
                                                                (2.02-3)

where p(r) = probability of the rth member.

Substituting the expression for the error quantity into equation 2.02-3
and expanding, we obtain

        I_{lv}(k) = \lim_{N \to \infty} \sum_{r=-N}^{N} p(r)
                    \Big[ a_{r,k}^2 - 2 a_{r,k} \sum_{n=0}^{M} A_{n,k} b_{r,k-n}
                    + \sum_{n=0}^{M} \sum_{m=0}^{M}
                      A_{n,k} A_{m,k} b_{r,k-n} b_{r,k-m} \Big]
                                                                (2.02-4)

Since the signal ensemble is not ergodic, we can no longer justify the
equating of time-averages with ensemble-averages. We therefore introduce the
concept of the ensemble-averaged linear correlation functions. The
autocorrelation is defined as

        R_{aa}(k, k-n) = \lim_{N \to \infty} \sum_{r=-N}^{N} p(r) a_{r,k} a_{r,k-n}

and the cross-correlation as

        R_{ba}(k, k-n) = \lim_{N \to \infty} \sum_{r=-N}^{N} p(r) a_{r,k} b_{r,k-n}

In the absence of specific information as to the distribution of the members
of the ensemble, one might assume that the occurrence of all members is
equally likely. Physical considerations may frequently justify this
assumption. On this basis the mean square error for the linear time-varying
filter becomes

        I_{lv}(k) = \lim_{N \to \infty} \frac{1}{2N+1}
                    \sum_{r=-N}^{N} \epsilon_{r,k}^2            (2.02-3a)

into which we insert, after expanding, the corresponding correlation functions

        R_{bb}(k, k-n) = \lim_{N \to \infty} \frac{1}{2N+1}
                         \sum_{r=-N}^{N} b_{r,k} b_{r,k-n}

and

        R_{ba}(k, k-n) = \lim_{N \to \infty} \frac{1}{2N+1}
                         \sum_{r=-N}^{N} a_{r,k} b_{r,k-n}.

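The resulting equation, irrespective of the probability distribution p(r), is

$$I_{lv}(k) = R_{aa}(k,k) - 2\sum_{n=0}^{M} A_{n,k}\, R_{ba}(k,k-n) + \sum_{n=0}^{M}\sum_{m=0}^{M} A_{n,k} A_{m,k}\, R_{bb}(k-n,k-m) \tag{2.02-5}$$

Now minimizing with respect to the $A_{n,k}$ coefficients

$$\frac{\partial I_{lv}(k)}{\partial A_{j,k}} = 0 \qquad \text{for } j = 0, 1, \ldots, M$$

we obtain the synthesis equations for the linear time-varying filter

$$\sum_{n=0}^{M} A_{n,k}\, R_{bb}(k-n,k-j) = R_{ba}(k,k-j) \qquad \text{for } j = 0, 1, \ldots, M \tag{2.02-6}$$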
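The synthesis equations can be illustrated numerically. The following is a minimal sketch, assuming a finite ensemble of equally likely members in place of the limit, with the ensemble averages taken as plain arithmetic means; the function names `solve` and `time_varying_weights` are illustrative and do not come from the report.

```python
def solve(mat, rhs):
    """Solve mat x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(v)] for row, v in zip(mat, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for i in range(n):
            if i != col:
                f = aug[i][col] / aug[col][col]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def time_varying_weights(a, b, k, M):
    """Weights A[0..M] for instant k (requires k >= M).

    a[r][k] is the desired output and b[r][k] the input of member r;
    all members are assumed equally likely (an assumption of this sketch).
    """
    members = len(b)

    def r_bb(i, j):  # ensemble average of b at instants i and j
        return sum(b[r][i] * b[r][j] for r in range(members)) / members

    def r_ba(i, j):  # desired output at instant i, input at instant j
        return sum(a[r][i] * b[r][j] for r in range(members)) / members

    mat = [[r_bb(k - n, k - j) for n in range(M + 1)] for j in range(M + 1)]
    rhs = [r_ba(k, k - j) for j in range(M + 1)]
    return solve(mat, rhs)
```

If every member's desired output at instant k is exactly `2*b[r][k] - b[r][k-1]`, the sketch returns weights close to 2 and -1, as the equations predict for an input that already contains the desired output as a linear combination.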



For a predicting filter, the synthesis equations are

[Equation illegible in the source.]

for j = 0, 1, ..., M

and s is as previously defined.

It should be noted that the above-described procedures yield a
specification for an optimum filter at a particular instant, the kth
sampling instant. The approach which one takes in specifying a time-varying
discrete filter depends on how rapidly the statistical characteristics
of the signal ensemble are varying compared to the time constants of
the controlled system.

If the variation is slow compared to these time constants, then
one can solve the filtering problem on a quasi-static basis. The solution
involves a filtering system consisting of a set of optimum filters
(each operating over a certain time interval) and a device for switching
from one filter to another. The switching device might be an
electro-mechanical relay network, or a conditional subprogram instruction to the
computer (such as the cp(-)z instruction in the Whirlwind I code). The
switch might be actuated by information as to elapsed time and/or as to
the quality of performance of the output of the filtering system. Such a
system, involving P separate discrete filters, is shown in Figure 2.02-1.
There it is assumed that each filter operates well over a range of ±N
sampling intervals about some sampling instant (at which it is optimum).
The number of filters, P, would obviously depend on what tolerance in
quality of performance were permitted, and on the price one were willing
to pay in storage facilities, program complexity, and computation time.

[Figure 2.02-1: Filtering system for a signal ensemble whose statistics
vary slowly compared to the time constants of the controlled system]

[Figure 2.02-2: Approximation of the weighting coefficients as functions
of the time variable k]

For the ensemble whose statistical structure varies too rapidly
for effective filtering on a quasi-static basis, we need merely extend
this approach to the limit and provide a different filter at each sampling
instant. Rather than evaluate and store the large number of sets of
weights directly, we might, as before, synthesize P separate discrete
filters. After plotting each of the $A_{n,k}$ weights as a function of the time
variable k, we approximate numerically by a smooth curve (see Figure
2.02-2) each of the M+1 discrete sequences. The desired set of weights
at any sampling instant is then obtained by interpolation. The storage
requirements are reduced since we now store only the coefficients
specifying the functional approximation of any $A_{n,k}$ as a function of k. Thus,
if Lagrangian interpolation were used, the set of M+1 polynomial
approximations would yield the optimum filter at each of the original P
sampling instants, and an approximation to the optimum filter at any other
instant. In this case,

$$A_{0,k} = \alpha_{00} + \alpha_{01} k + \alpha_{02} k^{2} + \cdots + \alpha_{0f} k^{f}$$

$$A_{1,k} = \alpha_{10} + \alpha_{11} k + \alpha_{12} k^{2} + \cdots + \alpha_{1g} k^{g}$$

$$\vdots$$

$$A_{M,k} = \alpha_{M0} + \alpha_{M1} k + \alpha_{M2} k^{2} + \cdots + \alpha_{Mh} k^{h}$$

where the degrees of the polynomials (f, g, ..., h) need not necessarily
be the same. Since the device controlling the position of the switch
would under these circumstances be time-actuated, we must necessarily
have insight into the manner in which the statistical structure of the
ensemble is varying in time.
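The interpolation step can be sketched as follows. This is a minimal illustration, assuming that one weight has been synthesized at a few sampling instants (`nodes`) and that its value at any other instant k is wanted; the function name `lagrange_weight` and the sample numbers are hypothetical. The sketch evaluates the Lagrange form directly rather than storing the polynomial coefficients.

```python
def lagrange_weight(k, nodes, values):
    """Evaluate at instant k the Lagrange polynomial through (nodes[i], values[i])."""
    total = 0.0
    for i, (ki, vi) in enumerate(zip(nodes, values)):
        term = vi
        for j, kj in enumerate(nodes):
            if j != i:
                term *= (k - kj) / (ki - kj)
        total += term
    return total


# Hypothetical weight synthesized at three sampling instants.
nodes = [0, 10, 20]
values = [1.00, 0.50, 0.40]
weight_at_15 = lagrange_weight(15, nodes, values)
```

At each of the original instants the polynomial reproduces the synthesized weight exactly; elsewhere it only approximates the optimum weight, which is the trade accepted in exchange for reduced storage.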

2.03 Synthesis Procedure for Non-Linear Time-Invariant Programs

We consider now that class of discrete filters for which there
is a non-linear, time-invariant relationship between the output data and
a finite number of input data. In particular, we treat the simple case
defined by

$$c_k = \sum_{n=0}^{M} A_n\, b_{k-n} + \sum_{p=0}^{q} B_p\, b_{k-p}^{2} \tag{2.03-1}$$
Accordingly, we define an error quantity

$$e_k = a_k - \sum_{n=0}^{M} A_n\, b_{k-n} - \sum_{p=0}^{q} B_p\, b_{k-p}^{2}$$

and proceed to derive sequences A and B such that

$$I_{nc} = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{k=-N}^{N} e_k^{2}$$

= mean square error for a non-linear
constant-coefficient filter

is a minimum.



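$$\begin{aligned}
I_{nc} ={}& \lim_{N\to\infty}\frac{1}{2N+1}\sum_{k=-N}^{N} a_k^{2}
 + \sum_{n=0}^{M}\sum_{m=0}^{M} A_n A_m \lim_{N\to\infty}\frac{1}{2N+1}\sum_{k=-N}^{N} b_{k-n}\, b_{k-m} \\
&+ \sum_{p=0}^{q}\sum_{p'=0}^{q} B_p B_{p'} \lim_{N\to\infty}\frac{1}{2N+1}\sum_{k=-N}^{N} b_{k-p}^{2}\, b_{k-p'}^{2}
 - 2\sum_{n=0}^{M} A_n \lim_{N\to\infty}\frac{1}{2N+1}\sum_{k=-N}^{N} a_k\, b_{k-n} \\
&- 2\sum_{p=0}^{q} B_p \lim_{N\to\infty}\frac{1}{2N+1}\sum_{k=-N}^{N} a_k\, b_{k-p}^{2}
 + 2\sum_{n=0}^{M}\sum_{p=0}^{q} A_n B_p \lim_{N\to\infty}\frac{1}{2N+1}\sum_{k=-N}^{N} b_{k-n}\, b_{k-p}^{2}
\end{aligned} \tag{2.03-3}$$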



If stationarity can be assumed, the concept of time-averaged correlation
functions may be introduced again and extended further. In addition to
the previously defined linear functions, we shall have occasion to utilize
the higher order non-linear functions.

[The defining equation for the higher order correlation functions is
illegible in the source.]

Because we shall always be dealing with powers of the input
sequence, some of the shifts, $k_i$, will be equal, and hence the complete
generality of the above defined functions is unnecessary. These higher
order functions may under such circumstances be redefined in terms of
linear correlations between the powers of the input. For the case treated
here, considerable simplification is obtained if we let

$$g_{k-n} = b_{k-n}^{2}$$



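Then

$$\begin{aligned}
I_{nc} ={}& R_{aa}(0) + \sum_{n=0}^{M}\sum_{m=0}^{M} A_n A_m\, R_{bb}(n-m)
 + \sum_{p=0}^{q}\sum_{p'=0}^{q} B_p B_{p'}\, R_{gg}(p-p') \\
&- 2\sum_{n=0}^{M} A_n\, R_{ba}(n) - 2\sum_{p=0}^{q} B_p\, R_{ga}(p)
 + 2\sum_{n=0}^{M}\sum_{p=0}^{q} A_n B_p\, R_{bg}(n-p)
\end{aligned} \tag{2.03-4}$$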
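The correlation functions entering this expression can be estimated from a finite record. The sketch below is only an illustration under stated assumptions: the limit is replaced by an average over the available samples, the non-linear term is taken as the square of the input, and a cross-correlation at lag tau is read as the average of the leading sequence at instant k times the lagged sequence at instant k - tau. The name `time_avg_corr` and the data are hypothetical.

```python
def time_avg_corr(lead, lagged, lag):
    """Finite-record estimate of the time average of lead[k] * lagged[k - lag]."""
    pairs = [
        (lead[k], lagged[k - lag])
        for k in range(len(lead))
        if 0 <= k - lag < len(lagged)
    ]
    return sum(u * v for u, v in pairs) / len(pairs)


b = [1.0, -2.0, 3.0, -1.0]      # hypothetical input sequence
g = [v * v for v in b]          # assumed non-linear term: g_k = b_k ** 2

r_bb_0 = time_avg_corr(b, b, 0)  # autocorrelation of the input at zero lag
r_gg_0 = time_avg_corr(g, g, 0)  # autocorrelation of the squared input
r_bg_1 = time_avg_corr(g, b, 1)  # cross-correlation: g leading, b lagged by 1
```

With estimates of this kind for every required lag, the mean square error becomes an ordinary quadratic form in the unknown weights.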



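Minimization of $I_{nc}$ requires that

$$\frac{\partial I_{nc}}{\partial A_h} = 0 \qquad h = 0, 1, \ldots, M$$

$$\frac{\partial I_{nc}}{\partial B_j} = 0 \qquad j = 0, 1, \ldots, q$$

From these relations we obtain the Wiener-Hopf equations for the non-linear
constant-coefficient filter:

$$\sum_{n=0}^{M} A_n\, R_{bb}(n-h) + \sum_{p=0}^{q} B_p\, R_{bg}(h-p) = R_{ba}(h) \qquad h = 0, 1, \ldots, M$$

$$\sum_{n=0}^{M} A_n\, R_{bg}(n-j) + \sum_{p=0}^{q} B_p\, R_{gg}(p-j) = R_{ga}(j) \qquad j = 0, 1, \ldots, q$$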

For a predicting filter, the Wiener-Hopf equations are of the form

[Equations illegible in the source.]

h = 0, 1, ..., M
j = 0, 1, ..., q

The minimum value for $I_{nc}$ for prediction is then

[Equation illegible in the source.] (2.03-7)

Because of the non-linearity of the filters, conventional frequency
domain analysis is not applicable. Note that the extension of the
above derivation procedure to any order of non-linearity is rather easily
made.

Illustrative Example 2.03-1

As an illustration of a case in which a non-linear weighting
sequence is superior to one which is linear, we consider the problem
involved in predicting flux linkages, $\lambda(t, i)$, in an electrical circuit
composed of an ideal capacitance C and an iron core inductance L. The
prediction of future values of flux linkages is predicated upon a knowledge
of past values of $\lambda(t, i)$ and of the physical mechanism governing the
variation of $\lambda(t, i)$. Because the core is subject to
magnetic saturation, the flux linkages depend on the amplitude of the
current i, as well as on time t. The condition of equilibrium of
electromotive forces in the circuit (Kirchhoff's voltage law) gives

$$\frac{d\lambda}{dt} + \frac{1}{C}\int_{0}^{t} i\, dt = 0$$

For the linearized version (an approximation of zero order) of the
problem, we have the familiar relation

$$\frac{d\lambda}{dt} = L\,\frac{di}{dt}$$

As a first approximation, however, we can assume that the condition of
saturation can be expressed by the equation

$$i = A\lambda + B\lambda^{3}$$

Substituting this expression in the original equation and differentiating,
we have

$$\frac{d^{2}\lambda}{dt^{2}} + \frac{1}{C}\left(A\lambda + B\lambda^{3}\right) = 0$$

We now approximate this equation describing a non-linear conservative
system by means of a difference equation

$$\frac{\lambda_{n+1} - 2\lambda_n + \lambda_{n-1}}{h^{2}} + \frac{A}{C}\,\lambda_n + \frac{B}{C}\,\lambda_n^{3} = 0$$

or

$$\lambda_{n+1} = C_0\,\lambda_n + C_1\,\lambda_{n-1} + D_0\,\lambda_n^{3} \tag{2.03-9}$$

where

$$C_0 = 2 - \frac{A h^{2}}{C}, \qquad C_1 = -1, \qquad D_0 = -\frac{B h^{2}}{C}$$

Information as to the values of A and B, and hence as to $C_0$ and $D_0$, is
contained in the normal magnetization curve for the inductor.
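The recursion can be exercised directly. The following is a minimal sketch, assuming a central second difference with sampling interval `h` (the symbol and the resulting coefficient formulas are assumptions of this sketch, chosen to agree with $C_1 = -1$); the function name `flux_sequence` and the parameter values in the usage line are hypothetical.

```python
def flux_sequence(lam0, lam1, A, B, C, h, steps):
    """Generate flux linkages from two starting values by the cubic recursion."""
    c0 = 2.0 - A * h * h / C   # weight on the present value
    c1 = -1.0                  # weight on the previous value
    d0 = -B * h * h / C        # weight on the cube of the present value
    seq = [lam0, lam1]
    for _ in range(steps):
        seq.append(c0 * seq[-1] + c1 * seq[-2] + d0 * seq[-1] ** 3)
    return seq


# With B = 0 the recursion reduces to the linear oscillator.
linear_case = flux_sequence(0, 1, A=1.0, B=0.0, C=1.0, h=1.0, steps=4)
```

Setting B to zero removes the cubic term, which gives a quick check that the coefficients have been entered with the right signs before the saturation term is added.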
Let us assume that our problem is the following:

A sequence of data relating to the time-variation
of flux linkages in the inductor is to actuate a control
system. These data, having been experimentally observed
by instruments capable of measuring only to within ±0.01
flux linkages, are thereby contaminated by quantization
noise. Not only shall our programs operate so as to
reduce the effects of noise, but, in addition, we shall
require that they improve the over-all system response
by acting as lead or predicting networks. We shall further
assume that, because of limited storage facilities, we can
allocate only two registers to retain the weighting
coefficients of a program.

Our problem, in essence, is to devise an optimum
two-element predicting filter for the processing of quantized
data.

Physical insight into the inductor-capacitor network has provided
us with a mechanism for generating future values of the $\lambda_k$ sequence from
past data: equation 2.03-9. By utilizing the knowledge of the mechanism
for the formation of the $\lambda_k$ sequence, the designer should be able to
predict more intelligently future values of this sequence. In the absence
of quantization noise, one would logically use the recursion equation
2.03-9 in his prediction. The introduction of noise, however, produces
a sequence of perturbed data, $b_k$, for which the recursion equation is no
longer valid. To predict future values of $\lambda_k$ from present and past values
of the perturbed sequence, one must introduce a smoothing mechanism (e.g.,
the least squares procedure). If, in specifying the characteristics of
the filter, the designer were to constrain the system so that it
performs only linear operations, he would not use all of the available
knowledge, and the performance of the filter would consequently be poorer.

To test the validity of this reasoning, we shall determine, by
means of classical least squares, a two-element predictor of each type:
linear and non-linear.

Suppose, for a given test signal, we use two different fluxmeters
as measuring devices. One, capable of indicating ±0.0001 flux linkages,
is primarily suited for laboratory work; the other, capable of indicating
±0.01 flux linkages, is rugged enough for control purposes. The former
supplies a $\lambda_k$ sequence; the latter, a $b_k$ sequence. For this test signal,
these sequences happen to be the following:

[Table of the $\lambda_k$ and $b_k$ sequences; only the entries below are legible
in the source.]

    lambda_k    b_k
    0.0000
    1.0729      1.07
    1.3666      1.37
    1.8364
    2.7290      2.73

Linear Predictor

We define

$$\lambda_{(k+s)p} = A_0\, b_k + A_1\, b_{k-1}$$

and seek to minimize

$$S_l = \sum_{k} \left(\lambda_{k+s} - A_0\, b_k - A_1\, b_{k-1}\right)^{2}$$

The minimization procedure leads to the system of equations

$$A_0 \sum_k b_k^{2} + A_1 \sum_k b_k\, b_{k-1} = \sum_k \lambda_{k+s}\, b_k$$

$$A_0 \sum_k b_k\, b_{k-1} + A_1 \sum_k b_{k-1}^{2} = \sum_k \lambda_{k+s}\, b_{k-1}$$

For s = 1, we have

(409.0286) $A_0$ + (122.9464) $A_1$ = 122.9156

(122.9464) $A_0$ + (409.0286) $A_1$ = 69.4838

From which

$A_0$ = 0.27422

$A_1$ = 0.08745

Non-Linear Predictor

We define

$$\lambda_{(k+s)p} = A_0\, b_k + B_0\, b_k^{3}$$

and seek to minimize

$$S_{nl} = \sum_{k} \left(\lambda_{k+s} - A_0\, b_k - B_0\, b_k^{3}\right)^{2}$$

The minimization procedure leads to the system of equations

$$A_0 \sum_k b_k^{2} + B_0 \sum_k b_k^{4} = \sum_k \lambda_{k+s}\, b_k$$

$$A_0 \sum_k b_k^{4} + B_0 \sum_k b_k^{6} = \sum_k \lambda_{k+s}\, b_k^{3}$$

For s = 1, we have

(409.0286) $A_0$ + (135 813.437) $B_0$ = 122.9156

(135 813.437) $A_0$ + (49 647 362.50) $B_0$ = 2716.3631

From which

$A_0$ = 3.07939

$B_0$ = -0.00837
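The two pairs of normal equations can be checked by direct solution. The sketch below is an arithmetic check only: several digits of the coefficients are hard to read in the source, and the values used here are the readings under which both systems reproduce the quoted solutions, so they should be treated as assumptions. The helper `solve_2x2` is illustrative.

```python
def solve_2x2(a11, a12, a21, a22, r1, r2):
    """Solve a 2-by-2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - a21 * r1) / det


# Linear predictor, s = 1 (coefficient digits as read above; assumed).
lin_a0, lin_a1 = solve_2x2(409.0286, 122.9464,
                           122.9464, 409.0286,
                           122.9156, 69.4838)

# Non-linear predictor, s = 1 (coefficient digits as read above; assumed).
non_a0, non_b0 = solve_2x2(409.0286, 135813.437,
                           135813.437, 49647362.50,
                           122.9156, 2716.3631)
```

The non-linear system is poorly conditioned (its determinant is a small difference of two large products), so the weights are sensitive to the trailing digits of the sums, which is worth bearing in mind when such weights are computed on a machine of limited register length.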



We now test these filters by attempting to predict $\lambda_{k+1}$
when given $b_k$ and $b_{k-1}$ for the linear filter or $b_k$ for the non-linear
filter.

We define

$\lambda_{kp}$ = predicted value for $\lambda_k$ based
on quantized data

$\epsilon_k = \lambda_k - \lambda_{kp}$

The results are summarized below:

Table 2.03-1 - Comparison of Performance of
a Linear and a Non-Linear Predictor

[Table body not recoverable from the source; the legible entries are
-2.8849, -3.1076, 1.6455, 17.5205, 14.6673, and 4.4987.]

It is not surprising to find the performance of the non-linear
predictor to be superior to that of the other since the data sequence is
derived from a non-linear generating mechanism. This superiority is made
evident by a comparison of the sums of the squares of the errors in
prediction for the two filters. Thus, for the linear predictor

$S_l$ = 333.0524

and for the non-linear predictor

$S_{nl}$ = 52.4260

In spite of the fact that the non-linear predictor used a smaller
"memory" (one datum), its performance was definitely superior.
Clearly, the availability of a priori information as to the nature of
the system enabled us to design a better predictor.

In order to visualize more clearly the mechanism of predicting
a perturbed sequence, one might resolve it into the mechanism of
predicting in the absence of noise and that of smoothing a perturbed sequence.
It is not our purpose to imply that these are not interrelated processes,
but rather to suggest this artificial separation as an aid to the
imagination. It is then seen that the former mechanism should approximate that
of the recursion equation as closely as possible, while the latter should
be as effective as possible in removing the undesirable perturbations.
This reasoning provides us with an insight into the manner in which the
particular type of non-linearity should be chosen. We may conclude that,
when the mechanism for generating the true data sequence is known or
suspected to be of a particular type of non-linearity, one should design the
appropriate non-linear filter.




2.04 Synthesis Procedure for Non-Linear Time-Varying Programs

The last procedure to be discussed is that in which the filters
process the non-stationary input data in such manner as to establish
a non-linear, time-varying relationship between the output and a finite
number of input data. Since the extension to higher orders of
non-linearity is obvious, we again treat the simple case defined by

$$c_{r,k} = \sum_{n=0}^{M} A_{n,k}\, b_{r,k-n} + \sum_{p=0}^{q} B_{p,k}\, b_{r,k-p}^{2} \tag{2.04-1}$$

The error quantity is then

$$e_{r,k} = a_{r,k} - \sum_{n=0}^{M} A_{n,k}\, b_{r,k-n} - \sum_{p=0}^{q} B_{p,k}\, b_{r,k-p}^{2}$$

and if we substitute

$$g_{r,k-p} = b_{r,k-p}^{2}$$

we obtain as the ensemble-averaged mean square error, $I_{nv}(k)$, for the
non-linear filter

$$I_{nv}(k) = \lim_{N\to\infty} \sum_{r=-N}^{N} p(r)\, e_{r,k}^{2}$$

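If the assumption of equal likelihood of occurrence of the member signals
is justified, then we may write

$$\begin{aligned}
I_{nv}(k) ={}& \lim_{N\to\infty}\frac{1}{2N+1}\sum_{r=-N}^{N} a_{r,k}^{2}
 + \sum_{n=0}^{M}\sum_{m=0}^{M} A_{n,k} A_{m,k} \lim_{N\to\infty}\frac{1}{2N+1}\sum_{r=-N}^{N} b_{r,k-n}\, b_{r,k-m} \\
&+ \sum_{p=0}^{q}\sum_{p'=0}^{q} B_{p,k} B_{p',k} \lim_{N\to\infty}\frac{1}{2N+1}\sum_{r=-N}^{N} g_{r,k-p}\, g_{r,k-p'}
 - 2\sum_{n=0}^{M} A_{n,k} \lim_{N\to\infty}\frac{1}{2N+1}\sum_{r=-N}^{N} a_{r,k}\, b_{r,k-n} \\
&- 2\sum_{p=0}^{q} B_{p,k} \lim_{N\to\infty}\frac{1}{2N+1}\sum_{r=-N}^{N} a_{r,k}\, g_{r,k-p}
 + 2\sum_{n=0}^{M}\sum_{p=0}^{q} A_{n,k} B_{p,k} \lim_{N\to\infty}\frac{1}{2N+1}\sum_{r=-N}^{N} g_{r,k-p}\, b_{r,k-n}
\end{aligned} \tag{2.04-3}$$

Substituting the ensemble-averaged correlations

$$\begin{aligned}
I_{nv}(k) ={}& R_{aa}(k,k) + \sum_{n=0}^{M}\sum_{m=0}^{M} A_{n,k} A_{m,k}\, R_{bb}(k-n,k-m)
 + \sum_{p=0}^{q}\sum_{p'=0}^{q} B_{p,k} B_{p',k}\, R_{gg}(k-p,k-p') \\
&- 2\sum_{n=0}^{M} A_{n,k}\, R_{ba}(k,k-n) - 2\sum_{p=0}^{q} B_{p,k}\, R_{ga}(k,k-p)
 + 2\sum_{n=0}^{M}\sum_{p=0}^{q} A_{n,k} B_{p,k}\, R_{bg}(k-p,k-n)
\end{aligned}$$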

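Minimizing $I_{nv}(k)$ with respect to $A_{h,k}$ and $B_{j,k}$ we obtain the synthesis
equations

$$\sum_{n=0}^{M} A_{n,k}\, R_{bb}(k-h,k-n) + \sum_{p=0}^{q} B_{p,k}\, R_{bg}(k-p,k-h) = R_{ba}(k,k-h) \qquad h = 0, 1, \ldots, M$$

$$\sum_{n=0}^{M} A_{n,k}\, R_{bg}(k-j,k-n) + \sum_{p=0}^{q} B_{p,k}\, R_{gg}(k-j,k-p) = R_{ga}(k,k-j) \qquad j = 0, 1, \ldots, q \tag{2.04-5}$$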



For a predicting filter, the equations are

[Equations illegible in the source.] (2.04-6)

h = 0, 1, ..., M
j = 0, 1, ..., q

The minimum value for $I_{nv}(k)$ for prediction is then

[Equation illegible in the source.] (2.04-7)




2.1 QUANTIZATION NOISE AND NUMERICAL COMPUTATIONS

Although variables encountered in control systems are frequently
continuous, a computer such as Whirlwind I requires incoming data in the
form of discrete quantities. The digital control of a continuous variable
consequently involves a conversion from continuous data to discrete (i.e.,
encoding) when raw information is supplied to the computer, and from
discrete to continuous (i.e., decoding) when the processed information is used
to actuate the control mechanisms. It should be noted that the conversion
problem is not peculiar to real-time computer applications, but rather that
it is rendered more difficult by the necessity for obtaining, within a very
limited time, results which are usable in controlling a dynamic system.

Not only does the original input signal to the encoder contain
noise from sources both external and internal to the control system, but the
very process of encoding further corrupts the signal so that filtering of
the output sequence is necessary if we are to extract the true message or some
function thereof. As has been previously indicated, this distortion results
primarily from the quantizing process and manifests itself, in the time
domain, as a limiting of the number of digits by which the variable may be
represented. When these sharply limited data are used in numerical
computations, we frequently find that a previously stable program will yield either
a divergent or an oscillating solution.

In the following sections we shall discuss the characteristics of
quantization noise and how they affect the ever-present round-off error in
numerical procedures.

2.11 Analysis of Noise Caused by the Encoding Process

Since later sections of this report deal with the design of linear
programs capable of smoothing and predicting future values of an encoded
sequence, it is appropriate that we now examine the encoding process in some
detail. Encoding may be defined as the process whereby a continuous time
series is converted to a discrete time series, the elements of which can
assume only a finite set of discrete amplitudes. This conversion
involves the separate, but commutative, operations of sampling and quantizing.

Since the signal distortion caused by sampling can frequently
be made negligibly small by proper design of the encoder, we shall make
only a few brief remarks about this operation. Referring to Figure 2.11-1,
we see that the sampling device can be represented by a switch rotating at
a constant angular frequency, followed by a holding circuit and an
amplifier. For a continuous signal input f(t), the sampler provides at
its output a sequence of pulses, $f_k$. All pulses are of equal duration T,
and successive pulses are T seconds apart. Insofar as information content
is concerned, there is no distortion provided only that the sampling
frequency is at least twice the highest signal frequency.

Linvill [21] has shown that the sampling operation may be visualized
as the process of modulating a continuous signal by an infinite train of
unit impulses. The resulting sampled signal has a spectrum which contains
the original frequency components as well as all harmonics of these
components. So long as the condition on the sampling rate is met, there will
be no overlap of the spectra and hence no distortion.

Within a finite range of amplitude variation, a continuous
signal, as well as its samples, can assume an infinite number of amplitude
levels. However, it may be neither possible nor necessary to transmit
the exact amplitudes of these samples because of various limitations
imposed by the transmission device or by the ultimate receptor. In such
cases it is permissible to represent and to transmit all levels within a
certain amplitude range by one discrete amplitude level. This means that
the original signal is to be replaced by a wave constructed of quantized
values selected on a minimum error basis from the discrete set available.
Clearly, if one assigns the quantum values with sufficiently close spacing,
one can make the quantized wave indistinguishable from the original signal.

The quantizing process may be visualized as being the result 
of operating on the signal with a "staircase transducer", a device having 
the instantaneous output-input characteristic shown in Figure 2.11-2. 
When a smoothly varying signal is the input, the output remains constant 
while the input varies within the boundaries of a tread, and changes 
abruptly by one full step when the signal crosses the boundary. A quan- 
tized signal wave and the corresponding error wave are shown in Figure 
2.11-3. 

The quantization error is, then, the inherent amount of
distortion resulting from the fact that the output of the encoder is limited
to a finite set of amplitude levels while its input occupies the same
amplitude range in a continuous manner. The maximum instantaneous value
of the error is half of one step and the total range of variation is from
minus half a step to plus half a step. Only if the input is known as an
exact function of time can one find an explicit relationship between it
and the corresponding error. Otherwise, one must resort to a statistical
description of the error q(t) since all that is known, in general, is that
it is a function of the amplitude of the quanta, $\alpha_i$, and the signal function
itself.

$$q(t) = F[\alpha_i, f(t)]$$

One can be somewhat more explicit when the quanta are equal and write

$$q(t) = f(t) - \alpha\, k(t)$$

where k(t) is a discrete variable taking on a range of integral values in
such a manner as to render |q(t)| a minimum.



$$k(t) = \begin{cases}
-n & \text{with probability } p(-n) \\
-n+1 & \text{with probability } p(-n+1) \\
\;\;\vdots \\
-1 & \text{with probability } p(-1) \\
\;\;0 & \text{with probability } p(0) \\
+1 & \text{with probability } p(+1) \\
\;\;\vdots \\
n-1 & \text{with probability } p(n-1) \\
\;\;n & \text{with probability } p(n)
\end{cases}$$

and p(h) = probability that f(t) lies in $[\alpha(h - 1/2),\ \alpha(h + 1/2)]$. In
the above formulation it is assumed that the signal is bounded in amplitude
by $(-n\alpha,\ n\alpha)$ so that there are 2n+1 quantizing levels. Unless the
probability density distributions of f(t) and k(t) are known or certain
simplifying assumptions are made, one can deduce little of value from the
foregoing analysis.

If one assumes, as did Mayer [20], that all errors are equally likely
in the range (-1/2, +1/2), then considerable simplification results. Let
the quantizer have $l$ discrete amplitude levels and accordingly $l$ steps
$\alpha_i$, which need not be equal, and p(i) be the probability of the ith level.
Then one can show that the total quantization noise power will be

$$\frac{1}{12}\sum_{i=1}^{l} p(i)\,\alpha_i^{2}$$

With equal steps $\alpha$, one obtains

$$\frac{\alpha^{2}}{12}$$
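The equal-step result can be checked numerically. The following is a minimal sketch, assuming a uniform quantizer of step `alpha` whose level index is the integer that minimizes the magnitude of the error, and a ramp input that sweeps a whole number of steps so that all errors are equally likely; the function names are illustrative.

```python
import math


def quantize(f, alpha):
    """Return (k, q): the level index k minimizing |f - alpha*k|, and the error q."""
    k = math.floor(f / alpha + 0.5)
    return k, f - alpha * k


def mean_square_error_on_ramp(alpha, steps=10, samples=100_000):
    """Mean of q**2 for a ramp input sweeping `steps` whole quantum steps."""
    total = 0.0
    for i in range(samples):
        f = (i + 0.5) / samples * steps * alpha
        total += quantize(f, alpha)[1] ** 2
    return total / samples


k, q = quantize(1.0729, 0.01)           # a datum read to within 0.01
ratio = mean_square_error_on_ramp(0.01) / (0.01 ** 2 / 12)
```

For the ramp, `ratio` comes out very close to one, in agreement with the equal-step expression; for an input that dwells near particular levels the errors are no longer equally likely and the agreement need not hold.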

There is, of course, no reason why the quantization need be
done on the basis of uniform spacing of the levels. Panter and Dite [22] have
shown that by taking the statistical properties of the signal into
consideration, the distortion introduced in a PCM system due to quantization
can be minimized by a proper level distribution which is a function of
the amplitude density distribution of the signal. Non-uniform
quantization may be accomplished by first compressing the signal, then uniformly
quantizing the modified signal. One of the more common forms of compression
is the logarithmic one, where the levels are crowded near the origin and
spaced farther apart near the peaks. Panter and Dite have also shown that,
with logarithmic compression, the distortion is largely independent of the
statistical properties of the signal.

Besides studying quantization error from the statistical point
of view, one can also investigate the power spectra of quantized signals.
Such an investigation for both uniform and non-uniform quantization was
made by Bennett [19]. The signal used was one having its energy uniformly
distributed throughout a definite frequency band and with the phases of
the components randomly distributed. Anticipating binary coding, Bennett
determined the power spectra for this signal quantized to several
different numbers of binary digits. As might be expected, not all of the
distortion fell within the signal band. The spectra of distortion
resulting from the uniform quantization of a random noise signal showed that

(a) the fewer the number of binary digits to which the
signal was quantized, the greater was the noise power
(a corroboration of Mayer's and of Bennett's analyses),

(b) the fewer the number of digits, the richer was the
spectrum in low-frequency components,

(c) the greater the number of digits, the flatter was the
spectrum over a wider range, but with a smaller
maximum density.

By increasing the number of digits (or quantizing levels) indefinitely,
one obtains the flat spectrum of "white" noise, a spectrum which is
that of the continuous input signal.

For the case of non-uniform quantization Bennett found that the
error spectrum out of the linear quantizer is virtually the same whether
or not the signal input is compressed. The advantage of non-uniform
quantization appears to lie in the fact that finer divisions are available
for weak signals. For a given number of total steps this means that
coarser quantization applies near the peaks of large signals, but the larger
absolute errors are tolerable here because they are small relative to the
larger signal values.

Having discussed some of the characteristics of quantization
noise, we now consider its effect on control systems. It becomes
immediately apparent that the presence of such noise may seriously affect the
steady-state performance of those systems having large time constants (i.e.,
systems characterized either by a good response over a narrow range about
zero frequency or by a slowly decaying impulse response). This stems from
the fact that a major portion of the noise power is concentrated at those
low frequencies for which slow systems have a good response. Since many
of the systems in which the digital computer will exercise a control
function are characterized by large time constants, we must devise techniques
for coping with the problem of quantization noise when it is an important
noise component. One such technique, dealt with herein, involves the use
of filters (which obviously need not be statistical) which are designed
with specific reference to the characteristics of this type of noise.

2.12 Effect of Computational Errors in Discrete Filters

In Section 2.0 it was indicated that weighting sequences can
be derived from ordinary differential equations by approximating the
derivatives by their corresponding divided differences. When this is done,
one obtains recursion formulae by means of which one can approximate the
solutions of specific differential equations by successive extrapolations.
It is apparent, however, that in neglecting the higher divided differences
in equations so derived one has committed an error. Each extrapolation will
therefore entail this truncation error which, if uncorrected, will tend to
accumulate with successive extrapolations until eventually the results of
the computations are rendered useless.

The truncation error is not the only source of uncertainty to which
numerical procedures are subject. The necessity of rounding off each
extrapolant evaluated with the aid of machines of limited register length will
provide another source of accidental error which tends to impair the accuracy
of the solution. In any numerical work using approximate formulae and numbers
subject to round-off, both errors will co-exist independently. Since their
presence cannot be avoided, one generally chooses his formulae and interval h
in such a way that, together with data to a sufficient number of digits, the
ultimate solution is obtained to the preassigned degree of accuracy.

If one makes certain assumptions as to the manner in which each of
these errors propagates, one finds that, for N successive extrapolations, the
total round-off error will grow more slowly with increasing N than does the
truncation error. By decreasing h (sampling more frequently), one reduces the
truncation error. However, this necessitates more extrapolations and hence
a greater round-off error. One therefore programs his work in such a way that
the two errors are equal at the end of the computations.

Although the least-squares sequences obtainable through Wiener-Lee
synthesis are not related to any specific differential equations, there is,
nevertheless, a truncation error whenever we let M assume a finite value.
Levinson has shown that a sequence based on M+1 weights always does a better
job of filtering (in the specified sense) than does a sequence based on M
weights. However, one rapidly reaches a point of diminishing returns in that
the improvement in filtering resulting from the additional weights does not
warrant the labor of computing the weights. Furthermore, the round-off error
increases with increasing M so that, in general, a short weighting sequence
is desirable.

CHAPTER III
EXPERIMENTAL ANALYSIS

3.01 Prediction in the Presence of Quantization Noise of Functions
Related to Straight-Line Flight

To lend substance to this investigation, we propose to use as
an input signal that generated by an aircraft as it flies on a prescribed
path across a polar grid. The resulting encoded sequences of the polar
variables, $r(t_k)$ and $\theta(t_k)$, are to be processed by the computer to give
the future position of the aircraft. Each of the variables will be treated
as a simple time series, although it is possible to treat them together
as multiple time series.

Such sequences might well arise in an air traffic control system 
where the incoming aircraft follow a definite time and space pattern in 
their approach to the airstrip. Since the use of digital computers in such 
systems is being actively contemplated, it is pertinent that a study be 
made of programs which will permit optimum processing of the information 
by the computer. 

It may be argued that, if the path of flight has been completely
specified by some geometrical curve, why undertake the labor of
determining a sequence of weights for statistical prediction. Wiener himself had
the following to say about this aspect:

"Statistical prediction is essentially a method of
refining a prediction which would be perfect by itself in an
idealized case but which is corrupted by statistical errors,
either in the observed quantity itself or in the observation.
Geometrical facts must be predicted geometrically and analytical
facts analytically, leaving only statistical facts to be
predicted statistically."

Our problem deals with the statistical prediction of a time series derivable
from a geometrical fact but corrupted or altered by other time series
introduced by the encoding mechanism. We desire to know what the uncontaminated
time series will do at some future instant. The problem has been idealized
to the extent of assuming that (a) the pilot is capable of flying a
geometrical course in spite of air currents and other disturbances, and (b)
the errors inherent in radar tracking of aircraft are negligible compared
to the quantization error. The validity of these assumptions will be
examined later.

In tackling the over-all problem of air traffic control, one
might logically hypothesize a control system such as that shown in Figure
3.01-1. The equipment lying within the broken-line boundary may be
considered as part of the digital computer.

The operation of this hypothetical system may be described as 



follows: 



a) the detection system monitors the position of the air- 
craft and supplies information as to the path variables 
of range and bearing with respect to the airstrip, 

b) the continuous, time-varying signal related to either 
of the variables is fed to the encoder which samples 
and quantizes it, thus furnishing numerical data for 
the computer, 

c) the prediction program processes these data in such
manner as to yield the best possible predicted value
for an epoch 15 seconds in the future,

d) the actual and desired future values of the path
variable are compared to yield an error quantity,
E(t). This error is evaluated and appropriate action
is initiated to signal the pilot of any discrepancy
in his flight path.

It is assumed that the continuous output signal, $f_o(t)$, will
be sampled regularly at intervals of fifteen seconds and that a corrective
signal, $A_c(t)$, will be supplied to the aircraft control system at the
same instants. The various quantities indicated on the diagram may be
defined as follows:

$f_o(t)$ = present value of path variable

$f_s(t)$ = sampled, quantized present value of path variable

$f_p(t)$ = predicted value of path variable at an epoch
fifteen seconds in the future

$f_d(t)$ = desired value of path variable at an epoch
fifteen seconds in the future

$E(t) = f_d(t) - f_p(t)$

$A_c(t)$ = corrective action signal

Note that all these time series are discrete with the exception of $f_o(t)$.

As a particular $f_o(t)$ we shall use that generated by an aircraft
flying a constant-velocity, constant-altitude straight line course which
does not pass over the origin of the polar grid. This hypothetical
mathematical model is shown in Figure 3.01-2, together with the equations which
specify the kinematics of the system. Clearly these equations define
geometrical sequences which are definitely non-stationary. Furthermore,
the nature of the quantization noise is completely specified when $\theta(t)$
is known explicitly as a function of time. The logical basis for the
application of statistical techniques, however, lies in the fact that the
quantities $V_h$, $R$, and $\alpha$ are random variables.

The reader might logically question the necessity of going
so far afield in search of an input signal to which to apply statistical
prediction. It is a well known fact that, even if it were mandatory
that he do so, the pilot is incapable of flying a precise geometrical
course because of air perturbations and because of his inherent
shortcomings as an element in a control loop. However, we again appeal to
the physical context of the problem and point out that, in a two-dimensional
control system such as ours, the only motion permissible for the
aircraft is a coordinated turn (i.e., no slip, constant altitude). For
such turns the aircraft dynamics are characterized by a first order lag
in which the time constant is relatively long (about 0.5 seconds). In
view of this fact, we may approximate the actual course of the aircraft
by a series of straight lines. Thus, if we are able to predict a
constant-velocity straight line course, we may be able to predict one in
which the aircraft executes slow maneuvers.

The experimental work discussed herein is devoted entirely
to the synthesis and evaluation of predicting filters based on equation
3.01-1. In this case it is evident that the ensemble of signals to be
processed is a collection of arctangents. Examination of this equation shows
that the random variable $\alpha$ establishes the d-c level of the signal and,
by reason of the circular symmetry of the system, may be ignored in our
analysis. It should be noted that ignoring $\alpha$ implies that we concern
ourselves primarily with the time-varying part of any member of the
signal ensemble.

In the absence of specific information regarding any existing 
air traffic control systems, we are forced to make certain assumptions 
concerning the control system and the statistical nature of the signals. 
These assumptions are:

a) the angle encoder is capable of distinguishing 
256 levels (eight binary digits) in the signal, 

b) quantization noise is the major component of 
corruption present in the signal, 

c) all velocities in the interval $V_{\min} \le V_h \le V_{\max}$
are equally likely,

d) all minimum ranges, $R$, in the interval considered
are equally likely,

e) the maximum range of interest is $R_{\max}$, and it
is at this range that the aircraft are first
detected,

f) the continuous signal, $\theta(t)$, is to be sampled
regularly at intervals of 15 seconds.

Since the angular jitter in radar noise may at times be of the 
same order of magnitude as the angular quantum (about 1.4 degrees), it
is questionable whether one is justified in assuming that quantizing noise 
is the major noise component. To simplify the analysis, however, we shall 
assume that the control system is relatively free of noise. 
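
The signal model and assumptions a) through f) can be illustrated numerically. The following is only a minimal sketch, not the program used in this investigation. It assumes a closest-approach geometry in which the bearing, with the d-c level ignored, is the arctangent of along-track distance over minimum range. The names `bearing_deg` and `quantize_deg` and all numerical values (velocity and ranges) are hypothetical.

```python
import math

QUANTUM_DEG = 360.0 / 256      # 256 levels (eight binary digits) over a full circle
SAMPLE_PERIOD_S = 15.0         # sampling interval of assumption f)

def bearing_deg(j, v_h, r_min, r_max):
    """Unquantized bearing at sampling instant j for a straight, constant-velocity course.

    Assumed geometry (not taken from the source figure): the aircraft is first
    detected at range r_max and passes the station at closest distance r_min.
    """
    along_track = v_h * SAMPLE_PERIOD_S * j - math.sqrt(r_max**2 - r_min**2)
    return math.degrees(math.atan2(along_track, r_min))

def quantize_deg(theta):
    """Round an angle in degrees to the nearest encoder level."""
    return QUANTUM_DEG * round(theta / QUANTUM_DEG)

# One hypothetical member of the ensemble: 100 m/s, 10 km minimum range,
# first detected at 50 km.
samples = [bearing_deg(j, 100.0, 10_000.0, 50_000.0) for j in range(40)]
quantized = [quantize_deg(theta) for theta in samples]
```

With 256 levels the quantum is 360/256 = 1.40625 degrees, which agrees with the figure of about 1.4 degrees quoted above, and the quantization error of any sample is at most half of that.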

3.02 Experimental Results

The experimental results presented here are concerned with the
design and evaluation of several linear time-varying filters whose
inputs are the ensemble of arctangents derivable from equation 3.01-1 and
whose outputs are to be future values of the signals. If, in that
equation, the quantity $\alpha$ is ignored and assumptions e and f (from Section 3.01)
are made, then equation 3.01-1 can be manipulated to obtain



$$\theta(n, j) = \tan^{-1}\{\,\ldots\,\} \qquad (3.02\text{-}1)$$

where n denotes the particular member of the ensemble and j denotes the
particular sampling instant. By means of this defining equation, we can
determine the value of any member of the ensemble at any sampling instant
for any choice of horizontal velocity, $V_h$, and minimum range, $R$. The
input sequences to the filters are thus angular data derived from quantizing
equation (3.02-1), and the desired output sequences are angular data
corresponding to unquantized predicted values at an instant one sampling
interval in the future.

The synthesis equations for the optimum least-squares predictor 
are given by equation (2.02-7) 



$$\sum_{n=0}^{M} W_{n,k}\, R_{ii}(k-n,\, k-j) = R_{id}(k+s,\, k-j)$$

for $j = 0, 1, \ldots, M$

and $s = 1$

As has been previously indicated, this system of equations yields the
optimum weighting sequence for the kth sampling instant. Since the statistical
structure of this ensemble varies rather slowly, it was decided to synthesize
a set of three filters, each of which operates over a certain range of the
discrete time variable, k. In order to verify experimentally the fact that
increasing the number of elements in the filter improves the performance, we
have designed three such sets of filters, for M = 1, 2, and 3.

The task of designing a filter is seen to be two-fold. First, one
must calculate the correlation functions; and second, solve the system of
simultaneous equations (2.02-7). The latter is relatively simple and can,
for small M, be done by hand computation if necessary. The computation of
the ensemble-averaged correlations, however, is a formidable task, even by
automatic methods. It is rendered manageable, in our case, by the
specification of the explicit form of the signals by equation (3.02-1) and of the
probability density distributions by assumptions c and d. Under these
circumstances the task can be mechanized by coding a computer program for the
computation of the autocorrelation, $R_{ii}(k-n, k-j)$, and the crosscorrelation,
$R_{id}(k+1, k-j)$, for appropriate values of the arguments.
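
The two-fold task just described can be sketched as follows. This is an illustrative sketch under stated assumptions, not the original program: the ensemble averages are approximated by random sampling of velocity and minimum range instead of by integration over their densities, the bearing geometry is an assumed closest-approach model, and every name and numerical value (`bearing`, `design_weights`, the velocity and range limits) is hypothetical.

```python
import math
import random

QUANTUM = 360.0 / 256          # encoder quantum in degrees (assumption a)
T = 15.0                       # sampling interval in seconds (assumption f)
R_MAX = 50_000.0               # hypothetical detection range, metres

def bearing(j, v_h, r_min):
    """Unquantized bearing in degrees at instant j (assumed geometry)."""
    x = v_h * T * j - math.sqrt(R_MAX**2 - r_min**2)
    return math.degrees(math.atan2(x, r_min))

def quantize(theta):
    return QUANTUM * round(theta / QUANTUM)

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small system a w = b."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in reversed(range(n)):
        w[r] = (m[r][n] - sum(m[r][c] * w[c] for c in range(r + 1, n))) / m[r][r]
    return w

def design_weights(inputs, desired, k, M, s=1):
    """Weights for instant k from ensemble-averaged correlations."""
    count = len(inputs)
    def r_ii(p, q):            # autocorrelation of the quantized input
        return sum(x[p] * x[q] for x in inputs) / count
    def r_id(p, q):            # crosscorrelation of desired output with input
        return sum(d[p] * x[q] for x, d in zip(inputs, desired)) / count
    a = [[r_ii(k - n, k - j) for n in range(M + 1)] for j in range(M + 1)]
    b = [r_id(k + s, k - j) for j in range(M + 1)]
    return solve(a, b)

def mean_square_error(weights, inputs, desired, k, s=1):
    total = 0.0
    for x, d in zip(inputs, desired):
        estimate = sum(w * x[k - n] for n, w in enumerate(weights))
        total += (d[k + s] - estimate) ** 2
    return total / len(inputs)

rng = random.Random(0)
desired = []
for _ in range(500):
    v_h = rng.uniform(80.0, 120.0)            # assumption c), hypothetical limits
    r_min = rng.uniform(5_000.0, 15_000.0)    # assumption d), hypothetical limits
    desired.append([bearing(j, v_h, r_min) for j in range(20)])
inputs = [[quantize(theta) for theta in member] for member in desired]

k, M = 10, 1                                  # a two-element predictor at instant 10
weights = design_weights(inputs, desired, k, M)
mse_optimum = mean_square_error(weights, inputs, desired, k)
mse_hold = mean_square_error([1.0, 0.0], inputs, desired, k)
```

Because the weights minimize the ensemble-averaged squared error, `mse_optimum` cannot exceed `mse_hold`, the error of a predictor that simply repeats the latest quantized sample, when both are evaluated on the design ensemble.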

Having computed the correlations and inverted the matrix equation,
we can then evaluate the performance of each of these three sets of filters
by observing its performance on various sample signals. The filter is
supplied with quantized data. It will then weight and combine these data to
yield an output which is the best possible approximation, in the least
square error sense, to the true predicted value.

The foregoing synthesis and evaluation was made, and the results
are presented below. The accompanying graphs show the distribution of
frequency of error in prediction as a function of error in prediction for
each of the three sets of filters. Some additional figures which give a
measure of the quality of performance are given in the following table.



Type of       Average    Standard     Mean Square    Percent of Samples     Maximum
Predictor     Error      Deviation    Error          Having Less Than 1°    Error

2-element     0.326°     0.80?°       0.750 deg²     [illegible]            [illegible]
3-element     0.238°     0.735°       0.598 deg²     [illegible]            [illegible]
4-element     0.229°     0.660°       0.4?4 deg²     [illegible]            2.50°
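
As a check on how the tabulated quantities relate, the mean square error of any error distribution equals the square of the average error plus the square of the standard deviation. The short sketch below applies this identity to the 3-element row; the function name is hypothetical and the check is an editorial illustration, not part of the original experiment.

```python
def mean_square_from_moments(average_error, standard_deviation):
    """Mean square error implied by an average error and a standard deviation."""
    return average_error**2 + standard_deviation**2

# Average error and standard deviation of the 3-element predictor, in degrees.
three_element = mean_square_from_moments(0.238, 0.735)
```

The result agrees with the tabulated 0.598 deg² to within rounding of the entries.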



These results are based, for each set of predictors, on 210
sample signals drawn from an ensemble which included signals in addition to
those in the original ensemble. Note that no special significance is to
be attached to the tolerance value of one degree. This is merely an
arbitrary basis for comparing these filters with each other and with the
quantizing unit in azimuth of 1.4 degrees. It should be noted that there is a
distinct improvement in performance as the number of elements in the
weighting sequence is increased. Additional experiments made for signals with
arbitrary non-zero $\alpha$ show that these filters perform nearly as well for
these as for members of the original ensemble so long as we do not attempt
to predict across the discontinuity at $\theta = \pm 180°$. This, however, is not
a fault inherent in statistical filters, but results from the fact that the
signal is a multi-valued function. The predicting filter can be
redesigned to cope with this discontinuity.

The experimental results summarized above indicate that rather 
good performance may be expected from relatively simple filters. Although 
improved performance can be expected from more complex filters, the 
net increment in improvement may not justify the additional computational 
labor and storage facilities. 

It should be noted that our synthesis procedures are not re- 
stricted to predicting filters only, but can be employed to derive any of 
the compensating filters which are so frequently used to improve the
performance of a servomechanism.

CHAPTER IV
CONCLUSIONS AND SUGGESTIONS FOR FURTHER STUDY

As a result of the theoretical analysis embodied in this
investigation, the statistical communication theory developed by Wiener and
Lee has been extended to the synthesis of real time computer programs.
The accompanying experimental design of certain linear predictors has 
established the validity of this synthesis procedure. The extreme flexi- 
bility of the digital computer, however, permits, with equal facility, 
the design and application of either linear or non-linear programs (i.e., 
discrete filters). 

Although the extension of this theory to the problem of
discrete filtering may be considered a step in the reduction of the art of
computer programming for certain applications to an exact science, much
work remains to be done in the further utilization, in the discrete
domain, of the concepts formulated by Wiener and Lee. Some of the more
promising subjects for investigation involving a union of statistical
communication theory and digital computer practice are the following:

(a) the development of discrete filters capable of dealing
with multiple time series. Such filters might be especially
useful in industrial or chemical process control,
where it is desired that the digital computer control
the behavior of the several interdependent variables
which determine the quality or quantity of the end
product, and

(b) the investigation of methods for determining what types
of non-linearities, if any, should be incorporated in
a filter for a specified ensemble of signals.

The above mentioned union of statistical theory and computer
practice offers certain other advantages which should not be overlooked.
Whereas the design of an analog network by the Wiener-Lee theory involves
the solution of an integral equation, the corresponding design of a
discrete filter involves only the solution of a set of linear simultaneous
equations. Not only can the digital computer be programmed to evaluate
any order correlation function (subject to limitations of storage
facilities), but it can be programmed to perform the matrix inversion required
for the determination of the weighting sequence. Regardless of whether
the filter is to be linear or not, our problem always involves a set of
linear equations.
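
The last remark can be illustrated with a small sketch. It assumes, purely for illustration, a filter whose output is a weighted sum of the input sample and its square; the filter is non-linear in the signal but linear in its weights, so least-squares design still reduces to simultaneous linear equations whose coefficients are averages of products of up to four samples. The names and the test signal are hypothetical.

```python
import random

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small system a c = b."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    c_out = [0.0] * n
    for r in reversed(range(n)):
        c_out[r] = (m[r][n] - sum(m[r][c] * c_out[c] for c in range(r + 1, n))) / m[r][r]
    return c_out

def design(features, desired):
    """Normal equations: sum_n c_n <g_n g_j> = <d g_j>, linear in the weights c_n."""
    count = len(desired)
    size = len(features[0])
    a = [[sum(g[n] * g[j] for g in features) / count for n in range(size)]
         for j in range(size)]
    b = [sum(d * g[j] for g, d in zip(features, desired)) / count
         for j in range(size)]
    return solve(a, b)

rng = random.Random(1)
samples = [rng.uniform(-1.0, 1.0) for _ in range(200)]
desired = [0.5 * x + 2.0 * x * x for x in samples]   # hypothetical non-linear relation
features = [[x, x * x] for x in samples]             # linear term and squared term
coefficients = design(features, desired)
```

Here the averages of x·x², and x²·x² play the part of the higher order correlations, and the recovered `coefficients` reproduce the two weights used to generate `desired`.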

Because of the comparative virginity of the field of discrete
filter synthesis it was not possible to make any conclusive comparisons 
between the performance of this class of statistical filters and those of 
other classes of filters. We are, however, justified in concluding that 
the synthesis procedures developed in this investigation lead to sensibly 
designed filters which are entirely aware of the characteristics of the 
noise (which is always present in any real signal) as well as those of the 
message. We may further conclude that, when the generating mechanism for
the ensemble of signals is known to be non-linear in nature, the perform- 
ance of an appropriately designed non-linear filter is definitely superior 
to that of the linear filter. We note that the specific kind of non- 
linearity to be incorporated into the filter is of considerable importance. 

The problem of quantization noise is of such importance in real
time computer applications that a study of its statistical characteristics is
certainly merited. These characteristics are a function of the particular
ensemble of signals and of the encoding mechanism. Although the experimental
designs of this investigation show that our filters are capable of dealing
effectively with the noise resulting from uniform quantization of the signal,
the use of such filters does not necessarily represent the optimum solution.
An alternative and possibly more satisfactory solution involves both a
non-uniform quantizing of the signal and a statistical filtering of the signal.

In conclusion, we may remark that the extension of statistical
communication theory to the synthesis of digital computer programs has provided
us with a logical means for endowing the computer with a higher order of
intelligence.



Signed
Abraham Katz



Approved
Robert R. Everett

Statistical Communication Theory

As a ready reference for the reader, we propose to include here a
very brief summary of the essential features of the Wiener-Kolmogoroff theory.
For more complete information, he should consult references 1, 2, 6, and 7.

In Section 1.01 it was pointed out that the basic concept of this
theory is that communication signals are to be treated as stationary time
series. The structure of such signals is frequently so complex as to render
it irresolvable in terms of summations of periodic or aperiodic components.
In fact, if the signals are to convey any new information to the receptor,
then they must be characterized by some elements of randomness in that
they are at least partially unpredictable in advance by the receptor. Hence
it is seen that we are frequently concerned with stationary random time
series which may be either continuous or discrete.

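For random, continuous phenomena having statistical properties
which are stationary, one may define the linear correlation functions as

$$\phi_{aa}(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f_a(t)\, f_a(t+\tau)\, dt$$

= autocorrelation of $f_a(t)$

and

$$\phi_{ab}(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f_a(t)\, f_b(t+\tau)\, dt$$

= crosscorrelation of $f_a(t)$ with $f_b(t)$.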

These functions have the properties that

$$\phi_{aa}(\tau) = \phi_{aa}(-\tau)$$

$$\phi_{ab}(\tau) = \phi_{ba}(-\tau)$$

The definition of the autocorrelation function indicates a process of
multiplying the function continuously by its value at a later time, $\tau$, and
averaging. The even function, $\phi_{aa}(\tau)$, is a maximum at $\tau = 0$, where it is
equal to the square of the rms value of $f_a(t)$. If $f_a(t)$ is nonperiodic,
$\phi_{aa}(\tau)$ approaches the square of the average of $f_a(t)$ as $\tau$ increases; if
$f_a(t)$ is periodic, $\phi_{aa}(\tau)$ has the same period. Any linearly additive
component of $f_a(t)$ produces its own linearly additive component of
$\phi_{aa}(\tau)$. A composite time function can be separated into linearly additive
time functions and the autocorrelation of each added linearly to give the
composite autocorrelation function. Furthermore, these linear functions,
$\phi_{aa}(\tau)$ and $\phi_{ab}(\tau)$, and their respective power density spectra are
determinable one from the other by a Fourier transformation. Thus,

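$$\Phi_{aa}(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \phi_{aa}(\tau)\, e^{-j\omega\tau}\, d\tau$$

and

$$\Phi_{ab}(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \phi_{ab}(\tau)\, e^{-j\omega\tau}\, d\tau$$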

Again, linearly additive components of the linear correlation functions can
be transformed separately and the separate transforms added to give the
composite power spectra.

This extension of Fourier theories to the harmonic analysis of
random processes through the medium of the linear correlation functions
provides us with a powerful tool for the synthesis of linear networks which are
optimal in a mean square error sense. For this class of filters, the
expression for the measure of error is



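$$\overline{\epsilon^2} = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \left[ f_o(t) - f_d(t) \right]^2 dt$$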



where $f_o(t)$ = actual output signal
      $f_d(t)$ = desired output signal

when $f_i(t)$ is the input signal. Minimization of the error expression
subject to the condition of linearity of the filtering mechanism yields the
Wiener-Hopf equation which relates the impulse response of the optimum
linear system to the statistical characteristics of the signal. Expressed
in terms of time domain synthesis, this equation requires that

$$\phi_{id}(\tau) = \int_{-\infty}^{\infty} h(\sigma)\, \phi_{ii}(\tau - \sigma)\, d\sigma \qquad \text{for } \tau \ge 0$$

where

$\phi_{ii}(\tau)$ = autocorrelation of the input signal

$\phi_{id}(\tau)$ = crosscorrelation between the input and the
desired output signals.

From the foregoing we see that the linear correlations are entirely adequate
for the specification of the linear system which minimizes the square of
the error. An entirely equivalent equation in terms of frequency domain
synthesis requires that

$$\Phi_{id}(\omega) = H(\omega)\, \Phi_{ii}(\omega)$$

An "optimum" filter designed for one member of an ensemble of signals will
be equally effective in the processing of any other member of the ensemble
having the same linear correlations.

By a logical extension of these ideas, one can define an infinite
number of higher order correlation functions. Thus, for a stationary
random signal

$$\phi_{aaa}(\tau_1, \tau_2) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f_a(t)\, f_a(t+\tau_1)\, f_a(t+\tau_2)\, dt$$

and so forth. Since the shifts $\tau_i$ are independent of each other, they may
be visualized as orthogonal axes in the hyperspace in which the correlation
functions are defined. Just as two dimensions are required for the
geometrical representation of the linear functions, so are n+1 dimensions required for
an nth-order correlation function.

Since it is conceivable that the reader has had little experience
with the higher order correlations, we include a few simple examples to
serve as illustrations. Accordingly we choose certain of those non-random
functions of time which permit a direct integration procedure. For the
non-random functions, the linear (and the non-linear) correlation functions
are defined somewhat differently. Thus, for periodic functions



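$$\phi_{aa}(\tau) = \frac{1}{T} \int_{0}^{T} f_a(t)\, f_a(t+\tau)\, dt$$

$$\phi_{ab}(\tau) = \frac{1}{T} \int_{0}^{T} f_a(t)\, f_b(t+\tau)\, dt$$

and for aperiodic functions

$$\phi_{aa}(\tau) = \int_{-\infty}^{\infty} f_a(t)\, f_a(t+\tau)\, dt$$

$$\phi_{ab}(\tau) = \int_{-\infty}^{\infty} f_a(t)\, f_b(t+\tau)\, dt$$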



As in the case for correlations related to random sequences, there are unique
Fourier transform pairs which relate the power (or energy) density spectra
with the appropriate correlation functions.

Consider now the following examples of second-order autocorrelations:

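(a) Periodic Signal

$$\phi_{aaa}(\tau_1, \tau_2) = \frac{1}{T} \int_{0}^{T} f_a(t)\, f_a(t+\tau_1)\, f_a(t+\tau_2)\, dt$$

Let $f_a(t) = A \cos(\omega t + \theta)$

$$\phi_{aaa}(\tau_1, \tau_2) = \frac{A^3}{T} \int_{0}^{T} \cos(\omega t + \theta) \cos(\omega t + \theta + \omega\tau_1) \cos(\omega t + \theta + \omega\tau_2)\, dt$$

$$= \frac{A^3}{4T} \int_{0}^{T} \left[ \cos(\omega t + \theta + \omega\tau_1 - \omega\tau_2) + \cos(\omega t + \theta - \omega\tau_1 + \omega\tau_2) + \cos(\omega t + \theta + \omega\tau_1 + \omega\tau_2) + \cos(3\omega t + 3\theta + \omega\tau_1 + \omega\tau_2) \right] dt = 0$$

It can be shown, in general, that the second-order autocorrelation
for periodic functions is everywhere zero.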



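(b) Aperiodic Signal

Let

$$f_a(t) = \begin{cases} E\, e^{-t}, & 0 \le t \le \infty \\ 0, & -\infty \le t < 0 \end{cases}$$

Then, for $\tau_1, \tau_2 \ge 0$,

$$\phi_{aaa}(\tau_1, \tau_2) = E^3 e^{-(\tau_1 + \tau_2)} \int_{0}^{\infty} e^{-3t}\, dt = \frac{E^3}{3}\, e^{-(\tau_1 + \tau_2)}$$

In general, the autocorrelation of any order of an exponential
signal of the form given is also exponential in form.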
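
The result of example (a) can be checked numerically. The sketch below evaluates the second-order autocorrelation of a single sinusoid over one period by a rectangle-rule sum; the amplitude, frequency, phase, and shifts are arbitrary hypothetical values, and the function name is not from the original text.

```python
import math

def second_order_autocorrelation(f, period, tau1, tau2, steps=3000):
    """(1/T) * integral over one period of f(t) f(t+tau1) f(t+tau2) dt (rectangle rule)."""
    dt = period / steps
    total = sum(f(i * dt) * f(i * dt + tau1) * f(i * dt + tau2) for i in range(steps))
    return total * dt / period

A, omega, theta = 2.0, 3.0, 0.4            # hypothetical amplitude, frequency, phase
period = 2 * math.pi / omega

def signal(t):
    return A * math.cos(omega * t + theta)

values = [second_order_autocorrelation(signal, period, t1, t2)
          for t1 in (0.0, 0.3, 1.1)
          for t2 in (0.0, 0.7)]
```

Every entry of `values` vanishes to within rounding error, as the four cosine terms in the expansion each integrate to zero over a full period.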

Parallel to this theory of statistical analysis for continuous
phenomena there runs a theory of discrete phenomena. These discrete phenomena
may occur naturally or whenever a continuous time sequence is discretized.
In the discrete case the function $f(t)$ of the continuous parameter $t$ is
replaced by the function $f_k$ of the parameter $k$, which varies by discrete
steps. Similarly the functions $\phi_{aa}(\tau)$ will be replaced by the discrete
set of autocorrelation coefficients,

$$R_{aa}(k) = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} f_n\, f_{n+k}$$



The analog of our previous function $\Phi_{aa}(\omega)$ will be

$$S_{aa}(\omega) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} R_{aa}(k)\, e^{-jk\omega}$$

of period $2\pi$. Likewise one can derive by a procedure entirely analogous
to the continuous case the Wiener-Hopf equation for the linear discrete
filter,

$$\sum_{n=0}^{M} W_n\, R_{ii}(k-n) = R_{id}(k)$$

where $k = 0, 1, \ldots, M$.
The corresponding equation in terms of frequency domain synthesis is

$$S_{id}(\omega) = G(\omega)\, S_{ii}(\omega)$$

where

$$G(\omega) = \sum_{n=-\infty}^{\infty} W_n\, e^{-jn\omega}$$

is the transfer function of the discrete filter.

The foregoing equations describing the linear discrete filter 
which is optimal in a least square error sense have served as the bases 
for the extension of the Wiener-Lee theory to the synthesis of digital 
computer programs. 
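
The discrete Wiener-Hopf equation for the stationary case can be sketched in a few lines. This is an illustration under stated assumptions and not a program from the report: the correlations are estimated by time averages over a finite record, the message is a hypothetical slow sinusoid, the corruption is hypothetical uniform noise, and the desired output is the message at the same instant (smoothing, not prediction).

```python
import math
import random

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small system a w = b."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in reversed(range(n)):
        w[r] = (m[r][n] - sum(m[r][c] * w[c] for c in range(r + 1, n))) / m[r][r]
    return w

def correlation(x, y, lag):
    """Time average of x[t] * y[t + lag] over the available samples (lag >= 0)."""
    count = len(x) - lag
    return sum(x[t] * y[t + lag] for t in range(count)) / count

def design_filter(inputs, desired, M):
    """Solve sum_n W_n R_ii(k - n) = R_id(k) for k = 0, 1, ..., M."""
    a = [[correlation(inputs, inputs, abs(k - n)) for n in range(M + 1)]
         for k in range(M + 1)]
    b = [correlation(inputs, desired, k) for k in range(M + 1)]
    return solve(a, b)

def mean_square_error(weights, inputs, desired):
    M = len(weights) - 1
    errors = [desired[t] - sum(w * inputs[t - n] for n, w in enumerate(weights))
              for t in range(M, len(inputs))]
    return sum(e * e for e in errors) / len(errors)

rng = random.Random(0)
message = [math.sin(2 * math.pi * t / 50) for t in range(2000)]
received = [m + rng.uniform(-1.0, 1.0) for m in message]

weights = design_filter(received, message, M=2)           # three-element filter
mse_filtered = mean_square_error(weights, received, message)
mse_unfiltered = mean_square_error([1.0], received, message)
```

For this record the three-element filter gives a markedly smaller mean square error than passing the received sequence through unchanged, since the weights average out much of the noise while the slowly varying message is nearly preserved.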